Parametric estimation of multi-dimensional homeomorphic transformations

ABSTRACT

In object registration and recognition based on a set of known templates, the tremendous set of possible transformations that may relate the template and an observed signature makes any detection and recognition problem ill-defined unless this variability is taken into account. The present invention estimates the deformation that transforms some pre-chosen representation of an object (template) into the current observation. The method employs a set of non-linear operators to replace a high dimensional problem by an equivalent linear problem, expressed in terms of the unknown parameters of the transformation model. The solution is applicable to any homeomorphic transformation regardless of its magnitude. In the special case where the transformation is affine the solution is shown to be exact.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No.60/518,006 filed Nov. 5, 2003 and U.S. Provisional Application No.60/604,745 filed Aug. 25, 2004.

BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to object recognition, and moreparticularly to a method to deform a template of an object into anobserved deformation of the object.

2. Summary of the Invention

Automated image registration is required whenever information obtainedfrom different views of an object needs to be combined or compared.Given two images, one is looking for a transformation, such that onetransformed image becomes similar to the second image. While anextensive amount of work has been done on this problem the fundamentalquestion of how to reliably and efficiently estimate the transformationrelating two images remains largely unsolved. The present invention isan approach aimed at solving directly this fundamental problem bymodeling the transformation as a general continuous 2-D mappingapproximated by a parametric model. It includes method for estimatingthe parameters of the transformation in a computationally efficientmanner involving a linear estimation problem rather than an extensivesearch. The solution is unique and is applicable to essentially anycontinuous transformation regardless of the magnitude of thedeformation. Once the transformation (or its inverse) has beenestimated, one can map one image onto the other, i.e. perform imageregistration and recognition.

This provides a solution to the problem of estimating the deformationbetween two images, a basic building block that can facilitate thesolution and implementation of many existing open problems including thegeneral problem of automatic recognition of objects.

In automatic recognition a measured image needs to be compared to alibrary of templates. This problem is greatly complicated by the need totake into account the deformation between the template and theobservation on the object. Using the method disclosed here, one canestimate this deformation and map the measured image so that it is readyto be matched with the template. This process will be carried out foreach template in the library.

The solution has a wide range of applications to problems of interest ina wide range of areas, such as in automatic detection and recognition ofdeformations and anomalies in medical images; in security systems whereclaimed identity has to be verified by comparing an acquired image of aperson or object to an existing database, or more challenging in systemswhere an object (such as a specific suspect) has to be identified froman input stream containing a large number of similar objects; in objectbased low bit rate image coding: most of the information on the movingobjects in the scene can be faithfully described and tracked as a set ofcontinuous transformations applied to a small set of templates providingthe object appearance from various observation angles; or in remotesensing image registration where the problem becomes especially severewhen images are taken at low angles and are therefore highly deformed bythe perspective projection.

The solution is aimed at directly solving the fundamental problem commonto this vast set of application areas and to many similar ones, namelyestimating the deformation relating any given observed signature of theobject to a pre-defined representation of it. The method provides anaccurate yet computationally very simple solution to a problem for whichexisting solutions require an extensive numerical optimization, which isnot guaranteed to provide the correct solution.

IN THE DRAWING

FIG. 1 shows a two-dimensional affine transformation of [0,1]×[0,1].

FIG. 2 shows the support of g(x,y).

FIG. 3 illustrates a graphical interpretation of two of the constraintson the solution, obtained by applying second-order-moment constraints.

FIG. 4 shows a template (bottom); an estimated deformed object obtainedby applying the deformation estimated from the observation to thetemplate (middle); and an observation on the deformed object (top).

FIG. 5 shows a template (bottom); an estimated deformed object obtainedby applying the deformation estimated from the observation to thetemplate (middle); and an observation of the deformed object (top).

FIG. 6 illustrates an image of the Mona Lisa: left, after registration,and right, before registration.

FIG. 7 shows the deformation function and statistics of theleast-squares estimate.

FIG. 8 illustrates a template (top); a noisy observation on the deformedobject (middle); and an estimated deformed object (bottom).

FIG. 9 illustrates the deformation function and it estimates.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

This invention is concerned with the general problem of objectregistration and recognition based on a set of known templates. However,while the set of templates is known, the variability associated with theobject, such as its location and pose in the observed scene, or itsdeformation are unknown a-priori, and only the group of actions causingthis variablity in the observation, can be defined. This hugevariability in the object signature (for any single object) due to thetremendous set of possible deformations that may relate the template andthe observed signature, makes any detection and recognition problemill-defined unless this variablity is taken into account. In otherwords, implicit or explicit registration of the observed objectsignature with respect to any template in an indexed set is an inherentand essential part of the solution to any detection and recognitionproblem.

The fundamental settings of the problem known as deformable templatesare discussed in the literature. There are two key elements in adeformable template representation: A typical element (the template);and a family of transformations which when applied to the typicalelement produces other elements—the possible observations. In thesimplest case transformations are rigid and are composed only oftranslation and rotation. When non-rigid transformations are consideredthe transformation group is much larger and can be represented by thegroup of homeomorphisms, or a subgroup of it. The set of transformationsforms a group action on the space of all possible transformed templates.Thus, each observation is an element of the orbit induced by the actionof the group on some pre-chosen observation on the object—the template.For multiple objects, the transformation space is best described as aunion of orbits, each representing a different object and its possibledeformations. Thus, given measurements of an observed object (forexample, in the form of an image) recognition becomes the procedure offinding the object template that minimizes some metric with respect tothe observation, either by jointly finding the group element and objecttemplate that minimize the metric, or by choosing a metric which isinvariant to the group action.

To enable a rigorous treatment of the problem we begin by defining the“similarity criterion”. Let G be a group and S be a set (a functionspace in our case), such that G acts as a transformation group on S. Theaction of G on S is defined by G×S→S such that for every φεG and everysεS, (φ,s)→s∘φ (composition of functions on the right), where s∘φεS.From this point of view, given two functions h and g on the same orbit,the initial task (that enables recognition in a second stage), is tofind the element φ in G that makes h and g identical in the sense thath=g∘φ. To simplify the discussion, at this stage we exclude the case of“self-similar” functions, i.e., functions ƒεS for which there existssome φεG such that φ is not the identity element while ƒ=ƒ∘φ.

To better illustrate the approach consider the following examples:S ₁ ={ƒ:R→R|ƒ measureable and bounded} G ₁ ={φ:R→R|φ(x)=ax+b,a≠0}  1.In this setting the objects are measurable and bounded functions fromthe real line to itself and the group is the group of linearnon-singular transformations. (Non-singularity is essential for havingthe structure of a group). Let s be an object in S₁. The group actiondefined by G₁ implies that the only transformations s can undergo areuniform scaling of its x-axis and shift. Consider now the entire family{s∘φ|φεG₁} which is the family of functions induced from the singleobject s (the orbit). Thus, given some other element in this family wewould like to know what object it is, and in what position it isrelative to s. If a time dependent sequence is provided, we would liketo track its evolvement as a function of time.S ₂ ={ƒ:R→R|ƒ measureable and bounded} G₂ ={φ:R→R|φ homeomorphism}  2.

Since G₂ is a group of homeomorphisms, the types of possibledeformations are unlimited for any practical purpose, and include theinvertible non-uniform one-dimensional transformations.S ₃ ={ƒ:R ² →R|ƒ measureable and bounded} G ₃ ={φ:R ² →R ² |φεSO₂(R){circle over (×)}R ²}  3.This is one of the simplest examples when two-dimensional domains arebeing considered. In this example the group action model only allows theobjects to be rotated or translated from their “original” position. Thusthe objects in this framework are all “rigid”.S ₄ ={ƒ:R ² →R|ƒ measureable and bounded} G ₄ ={φ:R ² →R ² |φεGL₂(R){circle over (×)}R ²}  4.This is a generalization of the previous example, yet the deformationsthe object may undergo can still be modeled as linear transformations.S ₅ ={ƒ:R ² →R|ƒ measureable and bounded} G ₅ ={φ:R ² →R ²|φhomeomorphism}  5.In this world deformations are invertible and, in general, non-uniform.The deformations are essentially not restricted, provided that theobject is not torn apart. Extensions to higher dimensional object modelsare straightforward.

Consider next the family of simple functions with triangular support,i.e., ƒεS₃ such that ƒ is one on some triangle in R² and zero otherwise.In (S₃,G₃) only rigid transformations are allowed. Thus in general, twodifferent simple functions with triangular support define two differentfunctions in S₃. Two functions are G₃-similar if and only if they arecongruent in the usual Euclidian sense. However, when the geometryinduced by (S₄,G₄) is considered, all simple functions with triangularsupport are G₄ similar since each of the lines defining the triangularsupport is mapped by a transformation in GL₂(R) to a line in R², and theintersections of these transformed lines define a triangular support.Similarly, all simple functions with parallelogram support areG₄-similar. (However simple functions with parallelogram support are notG₄-similar to simple functions with triangular support.) Yet when(S₅,G₅) is considered, all simple functions with convex support areG₅-similar.

The above setting enables formalization of practical questions using arigorous setting:

-   -   1. Motion Analysis: Is the problem of estimating the group        action as a function of time so that at each instant the correct        group element that acts on the object is identified. (In some        cases it is possible to add topological structure to the group        or even an analytic structure (Lie groups) that enables the        tracking of continuous or differentiable motions.)    -   2. Detection and Recognition: Is the task of finding the        equivalence class of the given object, thus identifying the        orbit associated with this object.

This disclosure concentrates on parametric modeling and estimation ofhomeomorphic transformations. Theoretically, in the absence of noise,the solution to the registration problem is obtained by applying each ofthe deformations in the group to the template, followed by comparing theresult to the observed realization. In the absence of noise, applicationof one of the deformations to the template yields an image, identical tothe observation. Thus the procedure of searching for the deformationthat transforms g into h is achieved, in principle, by a mapping fromthe group to the space of functions defined by the orbit of g. However,as the number of such possible deformations is infinite, this directapproach is computationally prohibitive. Hence, more sophisticatedmethods are essential.

Intuitively, it seems that the simplest way to relate any two objects,where one is assumed to be a deformed version of the other, is byassociating ordered coordinate sequences of labeled points, known aslandmarks, on the two objects. Each such landmark point is beingassociated with a specific, identifiable feature in the image domain.Thus, the shape of an object is described by the configuration of thelandmark points that are projected onto the image plane. The keyconcept, common to all such methods is that the points are labeled suchthat we know the correspondence between feature points across differentimages. Therefore, to make this approach feasible, the correspondenceproblem must be solved first. This approach may be feasible when thedeformation is close enough to the identity, the number of features isrelatively small (for combinatoric reasons), and there is a strongcontextual evidence to guide the solution to the correspondence problem.Yet, in many problems the feature points are not easily identifiable,and their number may be large—in which case the correspondence problemrapidly becomes very difficult to solve.

As already indicated above, sets of landmarks are only the simplest typeof objects that can be compared in order to evaluate the unknowndeformation. “Continuous” objects such as curves describing theboundaries of shapes, or images of surfaces are employed as well.Existing approaches for solving the deformation estimation problem fromobservations on continuous objects proceed by defining a cost functionon the space of deformations considered. This cost function penalizesboth the ‘distance’ between the deformed template and the observation,and a measure of the ‘size’ of the deformation. The aim is then to findthe deformation that minimizes the cost. Thus, the most common approachfor solving the problem of estimating the transformation parameters ismetric based. For example, in the case where the transformation isaffine this procedure is implemented by employing a search for thematrix A and vector cεR^(n) that minimize some predefined metric d onthe space of functions, i.e., (Â,ĉ)=argmin_(AεGL) _(n) _((R),cεR) _(n)d(h(x),g(Ax+c)). After choosing the metric d in a sense that has itsjustification in the concrete problem at hand, usually the solution isobtained using some optimization procedure to find the locally optimal(Â,ĉ) minimizing the distance d (h(x), g(Ax+c)). As indicted above,sometimes existence of a-priori knowledge about the specific problem,enables us to quantify the “distance” between the desired solution andthe identity transformation. Assuming for simplicity the translation iszero, this a-priori knowledge is described as a cost function (usually ametric) on GL_(n)(R), which we will denote by d_(GL) _(n) _((R)). Insuch cases the solution for the transformation matrix is formalized asthe solution to the problem of minimizing d(h(x),g(Ax))+d_(GL) _(n)_((R))(I,A) over all AεGL_(n)(R). Thus, even in the relatively simplecase where the transformation is affine, the resulting optimizationproblems are highly nonlinear and nonconvex, and the only algorithmsthat can be applied are computationally expensive numerical methods thatare not guaranteed to find the optimal solution.

In order to reduce some of the computational complexity of thesevariational methods, parametric models representing the deformation byspanning it over a set of basis function with unknown coefficients havebeen suggested. Examples include where B-splines are employed, wherepolynomial bases are adopted, and where harmonic basis functions havebeen chosen.

The main advantage of the metric-based methods over the landmarksapproach is that they avoid the need to find landmarks which, asindicated above, is a tough and not always well defined problem.Moreover, in real world applications automatic identification of thelandmarks when the deformation is far from the identity is a highlycomplicated task. On the other hand the main advantage of landmark basedmethods over metric-based continuous methods is that theoretically theywill not be attracted to a local solution. Composite methods have alsobeen suggested.

For the case where the deformation is affine, while assuming thedeformation is small and the observations differentiable, a widely usedapproach is to linearize the problem using a first order Taylor seriesexpansion. The major advantage of this approach is that in case thedeformation is indeed small, a solution to the problem of estimating theaffine transformation parameters is formulated as a solution to a systemof linear equations. This method is elaborated in Section 6, comparingit to the method disclosed in the present invention. The method of thepresent invention also provides a solution to the problem of finding thetransformation parameters by solving a system of linear equations.However, this new method, that originates from entirely differentconsiderations, is exact and not approximate, while being applicable todeformations of any size. Moreover the deformations are not restrictedto be affine but rather belong to the much larger class of homeomorphicdeformations.

Spectral domain approaches for estimating the affine transformationparameters have also been investigated. These methods first transformthe pair of object images into the Fourier plane, where the relationbetween the absolute values of the Fourier transforms is affine but isindependent of translations, as translations are related only to thephase. This introduction only briefly classifies and reviews the“mainstream” registration methods.

The method of the present invention, although it employs the continuousnature of the information in the observations on both the template andobject, cannot be cataloged into any of the above outlines. The centerof the solution of the present invention is a method to replace the highdimensional and computationally intensive problem of evaluating theorbit created by applying to any given template in the database thewhole set of transformations in the group, by an equivalent linearproblem expressed in terms of the unknown parameters of thetransformation model. In this setting, the problem of finding theparametric model of the deformation is mapped, by a set on non-linearoperators, into a set of linear equations which is then solved for thetransformation parameters. The solution is unique and exact and isapplicable to any homeomorphic transformation regardless of themagnitude of the deformation.

For the case where the transformation is affine, this solution isfurther extended to include the case where the deformation relating theobserved signature of the object and the template, is composed both ofthe geometric deformation due to the affine transformation of thecoordinate system and a constant illumination change.

The structure of Part A of the disclosure below is as follows: It beginsby rigorously defining the scope of the problem of finding the affinetransformation parameters, given an observation h and a template g of aplanar object, where the two are known to be related through an affinetransformation. To simplify the notations, the description in Section 2assumes that the translation is null and considers the problem offinding AεGL_(n)(R), given the observations on h and g. Then, Section 3uses a least squares approach to solve the problem of estimating theparameters of the affine transformation model, for the case where themodel is only an approximation of the true physical distortion. InSection 4 the algorithm for finding the affine transformation parametersis extended so that A, and the translation vector are jointly estimated.Section 5 further extends the framework of the model and considers thecase where h(x)=ag(Ax+c), where a is an unknown illumination gain sothat both a, A and c are unknown and need to be determined. Thederivation in the previous sections is further extended in Section 6 tohandle the case where the affine transformation changes with time, whilethe given template g is fixed in time. Sections 7, 8 and 9 consider theproblem of estimating the parameters of the affine transformation whenthe observation is subject to an additive noise. In Section 7 the leastsquare estimator is derived and is further analyzed for a specificchoice of the employed nonlinear operators. Section 8 derives the firstand second order statistics of the proposed solution, assuming thesignal to noise ratio is high. Under this assumption, Section 9 thenderives a maximum likelihood estimation algorithm for the affinetransformation parameters. In Section 10 presents some numericalexamples to illustrate the operation and robust performance of theproposed parameter estimation algorithm, for a broad range of affinedeformations.

Part A: Affine Transformations 1. Estimation of Multidimensional AffineTransformations

1.1. The Two-Dimensional Affine Transformation

We begin by defining the geometry of the transformation for the casewhere the transformed objects are two-dimensional. Extensions to then-dimensional case are immediate, as we show throughout. Morespecifically, let D be a compact subset of R² and let g:D→R be aLebesgue measurable and bounded function. Let also GL₂(R) denote thegroup of real valued invertible 2×2 linear transformations. Let A besome matrix in GL₂(R) and let s be a two-dimensional vector representingthe shift operation. Applying the transformation A to every (x,y)εDfollowed by shifting the result by s, defines a new compact subset ofR², that we denote by D_(A). FIG. 1 illustrates the result of applyingsuch a transformation, with s=0, to the basis vectors[1,0]^(T),[0,1]^(T), and the result of applying the affinetransformation A to the domain D=[0,1]×[0,1], where the transformedbasis vectors are now given by [a₁₁,a₂₁]^(T),[a₁₂,a₂₂]^(T),respectively.

In the following we will be interested in obtaining the affinetransformation parameters, given the “surface shape” g:D→R, and thesurface shape, h:D_(A)→R. which is the shape of the same surface, asmeasured after the coordinate transformation. The next subsectionprovides a rigorous definition of the more general n-dimensionalproblem.

1.2. Estimation of Multidimensional Affine Transformations: ProblemDefinition

The basic problem addressed in this disclosure is the following:Giventwo bounded, Lebesgue measurable functions h, g with compact supports,and with no affine symmetry (as rigorously defined below), such thath:R^(n)→Rg:R^(n)→Rwhereh(x)=g(Ax),AεGL _(n)(R),xεR ^(n)  (1)find the matrix A. Thus, to simplify the notations we first assume thattranslation is null and consider the problem of finding A, given theobservations on h and g. In Section 4 the algorithm for finding theaffine transformation parameters is extended so that A, and thetranslation vector are jointly estimated. In Section 5 we further extendthe framework of the model and consider the case where h(x)=ag(Ax+c),where a is an unknown illumination gain so that both a, A and c areunknown and need to be estimated.

Let M (R^(n),R) denote the space of compact support, bounded, andLebesgue measurable functions from R^(n) to R. Let x,c be vectors inR^(n). Let N⊂M(R^(n),R) denote the set of measurable functions with anaffine symmetry (or affine invariance), i.e.,N={ƒεM(R^(n),R)|∃AεGL_(n)(R),cεR,(A,c)≠(I,0)suchthatf(x)=ƒ(Ax+c) forevery xεR^(n)}. (Thus, N is the set of functions in M(R^(n),R), stableunder at least one element of the affine group. Obviously for any ƒεNand A,c such that ƒ(x)=ƒ(Ax+c) for every xεR^(n), A,c cannot be uniquelyrecovered). Note however that functions with compact support are nottranslation nor scale invariant. Hence the scope of the above affinesymmetry definition is in fact wider than the one encountered whendiscussion is limited to functions with compact support.

To illustrate the notion of N consider the following example (although ƒdoes not have a compact support): Let ${f(x)} = \left\{ \begin{matrix}{{1x} \in {\left\lbrack {2^{n},{2^{n} + 2^{n - 1}}} \right\rbrack{\forall{n \in Z}}}} \\{0\quad{else}}\end{matrix} \right.$

This function satisfies the relation ƒ(x)=ƒ(2x) but clearly there is nohope to make any distinction between ƒ(A₁x)=ƒ(2A₁x)∀A₁εR. Hence, ƒεN.

Other examples of affine invariant functions include any constantfunction defined on all of R^(n); any periodic function defined on allof R^(n); and in the two dimensional case, the functions with radialsymmetry (as SO₂(R)⊂GL₂(R)).

Assume c=0. Let${{M_{Aff}\left( {R^{n},R} \right)}\overset{\Delta}{=}{M\left( {R^{n},R} \right)}},$N denote the set of compact support and bounded Lebesgue measurablefunctions with no affine symmetry. Next, partition M_(Aƒƒ)(R^(n),R) intoaffine equivalence sets by the equivalence relation h

g

∃AεGL_(n)(R)|h(x)=g(Ax). (It can be easily checked that this is indeedan equivalence relation). Denote the quotient space by Q_(Aƒƒ)(R^(n),R),and let [g] denote the equivalence set of g.

Lemma 1. Let gεM_(Aƒƒ)(R^(n),R). Assume h(x)=g(Ax). Then thetransformation AεGL_(n)(R) satisfying the relation h(x)=g(Ax), isunique. Proof. Assume the statement is false. Hence, there exist twotransformations A and B where A≠B, such that h(x)=g(Ax) and h(x)=g(Bx).However, h(x)=g(Bx) implies g(x)=h(B⁻¹x)=g(AB⁻¹x) where AB⁻¹≠I. Hence,gεN which is a contradiction.

Thus, an equivalent statement of the estimation problem, stated at theopening of this section is as follows: Given any two objectsh,gεM_(Aƒƒ)(R^(n),R) that satisfy an affine relation AεGL_(n)(R), i.e.,[h]=[g]=q, for some qεQ_(Aƒƒ)(R^(n),R), find A.

The direct approach for solving the problem of finding the parameters ofthe unknown transformation AεGL_(n)(R) is to apply the set of allpossible affine transformations, (i.e., every element of GL_(n)(R)), tothe given template g, thus evaluating the orbit of g which representsthe action of the group on the template. Since h and g are affinerelated, one of the points on the orbit represents the action of thedesired group element A. Nevertheless, since A is an n×n matrix it isclear that implementation of such a search on the orbit requires asearch over an n²-dimensional manifold embedded in an infinitedimensional function space, which is infeasible.

This disclolsure shows that the problem of finding the parameters of theunknown affine transformation, whose direct solution requires a highlycomplex search in a function space, can be formulated as a parameterestimation problem. Moreover, it is shown that the original problem canbe formulated in terms of an equivalent problem which is expressed inthe form of a linear system of equations in the unknown parameters ofthe affine transformation. A solution of this linear system of equationsprovides the unknown transformation parameters. In the next section weshow how the problem of finding the parametric model of the affinedeformation can be transformed using a set on non-linear operators intoa set of linear equations which is then solved for the transformationparameters.

2. An Algorithmic Solution

In this section we specify the conditions, and provide a constructiveproof showing that given an observation on h(x)εM_(Aƒƒ)(R^(n),R) and anobservation on g(X)εM_(Aƒƒ)(R^(n), R) where h(x)=g (Ax), A can beuniquely determined. Moreover, it is shown that almost always thesolution for the unknown parameters of the affine transformation isobtained by solving only a linear system of equations. Let, x,yεR^(n),i.e., $\begin{matrix}{{x = \left\lbrack {x_{1},x_{2},\ldots,x_{n}} \right\rbrack^{T}}{y = \left\lbrack {y_{1},y_{2},\ldots,\quad y_{n}} \right\rbrack^{T}}{{Thus},{y = {Ax}},{x = {A^{- 1}y}}}{where}{A = {{\begin{pmatrix}a_{11} & \cdots & a_{1n} \\\vdots & \ddots & \vdots \\a_{n1} & \cdots & a_{nn}\end{pmatrix}\quad A^{- 1}} = \begin{pmatrix}q_{11} & \cdots & q_{1n} \\\vdots & \ddots & \vdots \\q_{n1} & \cdots & q_{nn}\end{pmatrix}}}} & (2)\end{matrix}$

Since AεGL_(n)(R), also A⁻¹εGL_(n)(R). It is therefore possible to solvefor A⁻¹ and the solution for A is guaranteed to be in GL_(n)(R).Moreover, as shown below, in the proposed procedure the transformationdeterminant is evaluated first, and by a different procedure than theone employed to estimate the elements of A⁻¹. Hence, a non-zero Jacobianguarantees the existence of an inverse to the transformation matrix.

Let ƒεM_(Aƒƒ)(R^(n),R) and let μ_(n) denote the Lebesgue measure onR^(n). Define the notation ∫_(R^(″))  f = ∫_(R^(″))  f  𝕕μ_(n)

Note that in the following derivation it is assumed that the functionsare bounded and have compact support, as they are measurable but notnecessarily continuous. It is further assumed that AεGL_(n)(R) has apositive determinant.

The first step in the solution is to find the determinant of the matrixA. A simple approach is to evaluate the Jacobian through the identityrelation: $\begin{matrix}{{\int_{\quad R^{''.}}^{\quad}{h^{2}(x)}} = {{\int_{R^{''}}^{\quad}{g^{2}({Ax})}} = {{A^{- 1}}{\int_{R^{''}}^{\quad}{g^{2}(y)}}}}} & (3)\end{matrix}$or through similar identities. Hence, $\begin{matrix}{{A^{- 1}} = \frac{\int_{R^{''}}^{\quad}{h^{2}\quad(x)}}{\int_{R^{''}}{g^{2}(y)}}} & (4)\end{matrix}$and |A⁻¹|=|A|⁻¹. In the second stage we prove that, provided that g is“rich” enough in a sense we rigorously define below, A⁻¹ can be uniquelyestimated by establishing n linear and independent constrains on the nelements in each of its rows. More specifically, let (A⁻¹)_(k) denotethe k th row of A⁻¹. We then have $\begin{matrix}{{\int_{\quad R^{''.}}^{\quad}{x_{k}{h(x)}}} = {{\int_{\quad R^{''.}}^{\quad}{x_{k}{g({Ax})}}} = {{{A^{- 1}}{\int_{\quad R^{''.}}^{\quad}{\left( {\left( A^{- 1} \right)_{k}y} \right){g(y)}}}} = {{A^{- 1}}{\sum\limits_{i = 1}^{n}\quad{q_{ki}{\int_{\quad R^{''.}}^{\quad}{y_{i}{g(y)}}}}}}}}} & (5)\end{matrix}$

To solve for {q_(ki)}_(i=1) ^(n), more constrains must be added. Towardsthis goal, apply a family of Lebesgue measurable, left-hand compositions{w_(l)}:R→R to the known relation h(x)=g(Ax), to yieldw _(l) ∘h(x)=w _(l) ∘g(Ax)  (6)Integrating over both sides of the equality in (6), similarly to (5) weobtain $\begin{matrix}{{\int_{\quad R^{''.}}^{\quad}{x_{k}{w_{l} \circ {h(x)}}}} = {{{A^{- 1}}{\int_{\quad R^{''.}}^{\quad}{\left( {\left( A^{- 1} \right)_{k}y} \right){w_{l} \circ {g(y)}}}}} = {{A^{- 1}}{\sum\limits_{i = 1}^{n}\quad{q_{ki}{\int_{\quad R^{''.}}^{\quad}{y_{i}{w_{l} \circ {g(y)}}}}}}}}} & (7)\end{matrix}$which yields when written in a matrix form the system $\begin{matrix}{{\begin{pmatrix}{\int_{\quad R^{''.}}^{\quad}{y_{1}\left( {w_{1} \circ {g(y)}} \right)}} & \cdots & {\int_{\quad R^{''.}}^{\quad}{y_{n}\left( {w_{1} \circ {g(y)}} \right)}} \\{\quad\vdots} & \ddots & \vdots \\{\int_{\quad R^{''.}}^{\quad}{y_{1}\left( {w_{n} \circ {g(y)}} \right)}} & \cdots & {\int_{\quad R^{''.}}^{\quad}{y_{n}\left( {w_{n} \circ {g(y)}} \right)}}\end{pmatrix}\begin{pmatrix}q_{k1} \\\vdots \\q_{kn}\end{pmatrix}} = \begin{pmatrix}{{A}{\int_{\quad R^{''.}}^{\quad}{x_{k}\left( {w_{1} \circ {h(x)}} \right)}}} \\\vdots \\{{A}{\int_{\quad R^{''.}}^{\quad}{x_{k}\left( {w_{n} \circ {h(x)}} \right)}}}\end{pmatrix}} & (8)\end{matrix}$

Similar system of equations is solved for each k to obtain the entirematrix A⁻¹ and thus A itself.

We have just proved the following theorem:

Theorem 1. Let AεGL_(n)(R). Assume h,gεM_(Aƒƒ)(R^(n),R) such thath(x)=g(Ax). Then, given measurements of h and g, A can be uniquelydetermined if there exists a set of measurable functions {w_(l)}_(l=1)^(n) such that the matrix $\begin{matrix}{G = \begin{pmatrix}{\int_{\quad R^{''.}}^{\quad}{y_{1}\left( {w_{1} \circ {g(y)}} \right)}} & \cdots & {\int_{\quad R^{''.}}^{\quad}{y_{n}\left( {w_{1} \circ {g(y)}} \right)}} \\\vdots & \ddots & \vdots \\{\int_{\quad R^{''.}}^{\quad}{y_{1}\left( {w_{n} \circ {g(y)}} \right)}} & \cdots & {\int_{\quad R^{''.}}^{\quad}{y_{n}\left( {w_{n} \circ {g(y)}} \right)}}\end{pmatrix}} & (9)\end{matrix}$is full rank.

Remark: Note that the denominator of (4), as well as the elements of thematrix G depend only on the template and its coordinate system and thushave to be evaluated only once. In fact the denominator of (4) togetherwith the matrix G represent all the information in the template,required for finding the affine transformation parameters. Thus thedenominator of (4) together with G form a “sufficient representation” ofthe template (similarly to the notion of sufficient statistics), so thatthe template itself is not needed for solving the estimation problemonce G and the denominator of (4) have been evaluated.

Remark: It should be noted that although we use the term “estimation”throughout this disclosure, the solution in (8) for the affinetransformation parameters is exact and is not an estimate in the usualsense of the word.

Remark: The application of a set {w_(l)}_(l=1) ^(n) to g(y) yielding (9)is in fact a mapping from the space of compact support, bounded andmeasurable functions to the space of n×n matrices. Recall that any n×nmatrix can be considered as a vector in R^(n) ² . As the set of singularn×n matrices is of measure zero in R^(n) ² (see Lemma 2 in Appendix A)it can be shown that unless g itself is linearly non-informative, i.e.,the first order moments of w_(p)∘g vanish for any w_(p) (as is the casein the example below), there will always (in the almost sure sense)exist a set {w_(l)}_(l=1) ^(n) generating a non-singular matrix (9), andhence a solution for the elements of A⁻¹. In other words, for any g,which is “rich” enough there will always exist a set {w_(l)}_(l=1) ^(n)generating a non-singular matrix.

Remark: Note that the solution for A employs only zero (the Jacobian)and first order constraints (obtained by multiplying w_(l)∘h by x_(k))and avoids the use of higher order moments. However, imposing such arestriction (which is clearly convenient due to its simplicity) mayresult in cases where a system of the type (9) does not exist, asillustrated by the following example. It is then obvious that higherorder moments are needed to obtain a system similar to (9) (yetnonlinear) with enough equations to solve for all the unknowns.

Example: This example illustrates some of the difficulties andambiguities of the problem and its solution, and hence the limitationsof the proposed method relative to the inherent ambiguities of theproblem. Consider the function${g\left( {x,y} \right)} = \left\{ \begin{matrix}60 & {{x},{{{y} \leq 1};{{xy} \geq 0}}} \\0 & {otherwise}\end{matrix} \right.$

The support of the function is depicted in FIG. 2.

In this example we shall restrict ourselves to a subgroup of GL₂(R)where each transformation in this subgroup is an upper triangularinvertible matrix i.e., $Q = {\left\{ {{\begin{pmatrix}a & c \\0 & b\end{pmatrix}{{ab}}} \neq 0} \right\}.}$Next, consider an embedding of Q in R³ where the coordinate axes are a,b, c (this choice of Q enables us an easy graphical interpretation inthe 3-D space). Thus, the condition ab≠0 implies that Q is all of R³except the planes a=0,b=0

Let S_(g) denote the set of elements in Q that result in affine symmetryfor the given function g (i.e., the set of elements in Q such thatg(x,y)=g(ax+cy,by)). Due to the definition of g, only two such symmetryconditions exist: g(x,y)=g(−x,−y) and g(x,y)=g(y,x). However since thetransformation g(x,y)=g(y,x) cannot be produced by the assumed groupstructure, we have that $S_{g} = {\left\{ {\begin{pmatrix}1 & 0 \\0 & 1\end{pmatrix},\begin{pmatrix}{- 1} & 0 \\0 & {- 1}\end{pmatrix}} \right\}.}$Note that for g to have no affine symmetry w.r.t. to the action of Q,the subgroup Q has to be redefined, for example as$Q = {\left\{ {{\begin{pmatrix}a & c \\0 & b\end{pmatrix}a},{b > 0}} \right\}.}$

To illustrate the concept and the performance of the proposed method,let us consider the case where the observation is in fact identical tog, i.e., the coordinate transformation between g and the observation his the identity.

Thus the problem is to find φ, where g and h are given, and it is knownthat g and h are related through the relation h=g∘φ, φεQ.

Applying the Jacobian condition, (4), we obtain the first constraint thesolution for φ must satisfy: $\begin{matrix}{{\int{\int_{R^{2}}^{\quad}{{h^{2}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {120 = {\int{\int_{R^{2}}^{\quad}{{g^{2}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}}}} & \quad \\{{and}\quad{hence}} & \quad \\{{ab} = 1.} & (10)\end{matrix}$

Since for any transformation in Q, x=au+cv and y=vb, application of thefirst-order-moment constrains, (7), yields (as the Jacobian was found in(10) to be equal to 1): $\begin{matrix}{{\int{\int_{R^{2}}^{\quad}{{{xh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {\int{\int_{R^{2}}^{\quad}{\left( {{a\quad u} + {cv}} \right){g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} & (11) \\{\quad{= {{a{\int{\int_{R^{2}}^{\quad}{{{ug}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} + {c{\int{\int_{R^{2}}^{\quad}{{{vg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}}} & \quad \\{{\int{\int_{R^{2}}^{\quad}{{{yh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {\int{\int_{R^{2}}^{\quad}{({bv}){g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} & \quad \\{\quad{= {b{\int{\int_{R^{2}}^{\quad}{{{vg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}} & \quad \\{however} & \quad \\{{\int{\int_{R^{2}}^{\quad}{{{yh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {0 = {\int{\int_{R^{2}}^{\quad}{{{vg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}} & \quad \\{{\int{\int_{R^{2}}^{\quad}{{{xh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {0 = {\int{\int_{R^{2}}^{\quad}{{{ug}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}} & \quad\end{matrix}$and hence first-order-moment constraints of the type (7) do not provideany information on the constraints a, b, c must satisfy. Moreover, sinceg is an even function, the matrix G will always be null, regardless ofthe chosen composition function w_(i) (as the w_(i)'s operate on the setof levels the function assumes and not on its support), and thereforethe necessary condition in Theorem 1, cannot be satisfied. We thereforeconclude that g is linearly non-informative and that higher ordermoments must be considered. Considering second order moments we have$\begin{matrix}{{\int{\int_{R^{2}}^{\quad}{x^{2}{h\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {\int{\int_{R^{2}}^{\quad}{\left( {{a\quad u} + {cv}} \right)^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} \\{{= {{a^{2}{\int{\int_{R^{2}}^{\quad}{u^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} + {2a\quad c{\int{\int_{R^{2}}^{\quad}{{{uvg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} + {c^{2}{\int{\int_{R^{2}}^{\quad}{v^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}},} \\{{\int{\int_{R^{2}}^{\quad}{{{xyh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {\int{\int_{R^{2}}^{\quad}{\left( {{a\quad u} + {cv}} \right)({bv}){g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} \\{= {{{ab}{\int{\int_{R^{2}}^{\quad}{{{uvg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} + {{bc}{\int{\int_{R^{2}}^{\quad}{v^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}} \\{and} \\{{\int{\int_{R^{2}}^{\quad}{y^{2}{h\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {\int{\int_{R^{2}}^{\quad}{({bv})^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}} \\{\quad{= {b^{2}{\int{\int_{R^{2}}^{\quad}{v^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}}\end{matrix}$where this time we have $\begin{matrix}{{\int{\int_{R^{2}}^{\quad}{y^{2}{h\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {40 = {\int{\int_{R^{2}}^{\quad}{v^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}} \\{{\int{\int_{R^{2}}^{\quad}{x^{2}{h\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {40 = {\int{\int_{R^{2}}^{\quad}{u^{2}{g\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}} \\{{\int{\int_{R^{2}}^{\quad}{{{xyh}\left( {x,y} \right)}{\mathbb{d}x}{\mathbb{d}y}}}} = {30 = {\int{\int_{R^{2}}^{\quad}{{{uvg}\left( {u,v} \right)}{\mathbb{d}u}{\mathbb{d}v}}}}}}\end{matrix}$Substituting into the above moment equalities we have40=b²4030=30ab+40bc40=a ²40+60ac+40c ²which is equivalent tob=±1  (12)3=3ab+4bc  (13)2=a ²2+3ac+2c ².  (14)

The constraint b=±1 describes two planes in R³, the constraint2=2a²+3ac+2c² describes a manifold in R³, while each of (10) and (13)describes two manifolds in R³. (See the graphical interpretation of theconstraints (12) and (14) in FIG. 3.) It can be easily verified thatindeed the only real valued solutions of the system (10)-(14) are givenby a=1,b=1,c=0 and a=−1,b=−1,c=0 which is the correct solution,providing the elements of S_(g).

We have thus described a simple GL_(n)(R) problem where first ordermoments are not enough to obtain a solution for the parameters of thedeformation, but higher order moments constrain the solution to theinherent ambiguity of the problem. This implies that even for a non-richfunction when high-order moments are considered, a solution isachievable.

Remark: Note that in the absence of noise it is possible to generalizethe above solution, while keeping it computationally simple, byconsidering as independent parameters, functions of the model parameters(such as q_(k1) ²) that will appear when second- or higher order momentsare employed. One can then solve a linear problem expressed in terms ofthe functions of the problem parameters, thus collecting moreinformation than obtained when only first order moments are employed,for simplicity, in evaluating G.

3. Estimation in the Presence of Model Mismatch

In general, it may happen that there exists a mismatch between theassumed model and the physical one, for example in the presence of noiseor when the transformation is not affine but close to it in some senseso that it is desired to best approximate the deformation by an affinetransformation. Thus, following the above solution, additionalconstraints can be added by considering additional compositions{w_(l)}_(l=1) ^(p):R→R with p≧n. Thus (3) is replaced by $\begin{matrix}{{{\int_{R^{n}}^{\quad}{w_{k} \circ {h(x)}}} \approx {\int_{R^{n}}^{\quad}{w_{k} \circ {g({Ax})}}}} = {{A^{- 1}}{\int_{R^{n}}^{\quad}{w_{k} \circ {g(y)}}}}} & (15)\end{matrix}$

Thus, (15) produces the overdetermined system $\begin{matrix}{\begin{pmatrix}{\int_{R^{n}}^{\quad}{w_{1} \circ {h(x)}}} \\\vdots \\{\int_{R^{n}}^{\quad}{w_{p} \circ {h(x)}}}\end{pmatrix} \approx {{A^{- 1}}\begin{pmatrix}{\int_{R^{n}}^{\quad}{w_{1} \circ {g(y)}}} \\\vdots \\{\int_{R^{n}}^{\quad}{w_{p} \circ {g(y)}}}\end{pmatrix}}} & (16)\end{matrix}$which by a least squares solution provides an estimate for |A|.Similarly, (8) is replaced by the overdetermined system $\begin{matrix}{{\begin{pmatrix}{\int_{R^{n}}^{\quad}{y_{1}{w_{1} \circ {g(y)}}}} & \cdots & {\int_{R^{n}}^{\quad}{y_{n}{w_{1} \circ {g(y)}}}} \\\vdots & ⋰ & \vdots \\{\int_{R^{n}}^{\quad}{y_{1}{w_{p} \circ {g(y)}}}} & \cdots & {\int_{R^{n}}^{\quad}{y_{n}{w_{p} \circ {g(y)}}}}\end{pmatrix}\begin{pmatrix}q_{k1} \\\vdots \\q_{kn}\end{pmatrix}} \approx \begin{pmatrix}{{A}{\int_{R^{n}}^{\quad}{x_{k}{w_{1} \circ {h(x)}}}}} \\\vdots \\{{A}{\int_{R^{n}}^{\quad}{x_{k}{w_{p} \circ {h(x)}}}}}\end{pmatrix}} & (17)\end{matrix}$and the solution for the entries of A becomes a linear LS solution.

Remark: Since the set of singular n×n matrices is of Lebesgue measurezero in R^(n) ² , almost every n×n matrix is rank-n. Hence, augmentingthe matrix (9) with additional rows by employing p>n constraints yieldson the L.H.S. of (17) a matrix which is almost-surely rank-n (unless gitself is linearly non-informative), and hence it can be shown thatthere will always (in the almost sure sense) exist a least squaressolution for the elements of A⁻¹. Invoking Lemma 2 again, this time tothe estimate of A⁻¹, we conclude that almost surely the estimated matrixis non-singular and hence in GL_(n)(R), as required.

4. Finding the Affine Transformation Parameters in the Presence ofTranslation

In this section we extend the solution presented in Section 2 byconsidering the more general problem where the observed object issubject to an affine transformation which includes an unknowntranslation. In this case the goal is to find the parameters of theaffine transformation including the translation. The transformationmodel is thus given byy=Ax+c,x=A ⁻¹ y+b  (18)where x,y,A,A⁻¹ are defined as in Section 4, while c and b=−A⁻¹c aren-dimensional vectors of unknown constants, each representing thetranslation along a different axis, in the coordinate transformationmodel and its inverse, respectively. More specifically let b=[q₁₀, q₂₀,. . . , q_(n0)]^(T). Define y₀=1 and let {tilde over (y)}=[y_(0, y) ₁, .. . , y_(n]) ^(T). Hence, using (18)x=T{tilde over (y)}  (19)where T is an n×(n+1) matrix given by $T = \begin{pmatrix}q_{10} & q_{11} & \ldots & q_{1n} \\\vdots & \quad & ⋰ & \vdots \\q_{n0} & q_{n1} & \ldots & q_{nn}\end{pmatrix}$

The Jacobian of the transformation (19) is the same as in the case wherethere is no translation (analyzed in Section 4) and hence it is foundusing (3)-(4). The evaluation of T is performed by finding the n+1elements of each of its n rows by constructing an n+1 dimensional systemof linear equations, in a similar way to the procedure in Section 4.More specifically, applying a family of Lebesgue measurable, left-handcompositions {w_(l)}:R→R to the known relation h(x)=g(Ax+c) andintegrating over both sides of the equality, similarly to (5) we obtain$\begin{matrix}{{{{\int_{R^{n}}{x_{k}{w_{l} \circ {h(x)}}}} = {{A^{- 1}}{\int_{R^{n}}{\left( {T_{k}\overset{\sim}{y}} \right){w_{l} \circ {g\left( \overset{\sim}{y} \right)}}}}}}\quad = {{A^{- 1}}{\sum\limits_{i = 0}^{n}\quad{q_{ki}{\int_{R^{n}}{{\overset{\sim}{y}}_{i}{w_{l} \circ {g\left( \overset{\sim}{y} \right)}}}}}}}}\quad} & (20)\end{matrix}$and in a matrix form, since g(y)=g({tilde over (y)}), $\begin{matrix}{{\begin{pmatrix}{\int_{R^{n}}\left( {w_{1} \circ {g(y)}} \right)} & {\int_{R^{n}}{y_{1}\left( {w_{1} \circ {g(y)}} \right)}} & \ldots & {\int_{R^{n}}{y_{n}\left( {w_{1} \circ {g(y)}} \right)}} \\\vdots & ⋰ & \vdots & \quad \\{\int_{R^{n}}\left( {w_{n + 1} \circ {g(y)}} \right)} & {\int_{R^{n}}{y_{1}\left( {w_{n + 1} \circ {g(y)}} \right)}} & \ldots & {\int_{R^{n}}{y_{n}\left( {w_{n + 1} \circ {g(y)}} \right)}}\end{pmatrix}\begin{pmatrix}q_{k0} \\q_{k1} \\\vdots \\q_{kn}\end{pmatrix}} = \begin{pmatrix}{{A}{\int_{R^{n}}{x_{k}\left( {w_{1} \circ {h(x)}} \right)}}} \\\vdots \\{{A}{\int_{R^{n}}{x_{k}\left( {w_{n + 1} \circ {h(x)}} \right)}}}\end{pmatrix}} & (21)\end{matrix}$

Similar system of equations is solved for each k to obtain all theelements of T. Hence we have the following:Theorem 2. Let AεGL_(n)(R). Assume h,gεM_(Aƒƒ)(R^(n),R) such thath(x)=g(Ax+c). Given measurements of h and g, then A and c can beuniquely determined if there exists a set of Lebesgue measurablefunctions {w_(l)}_(l=1) ^(n+1) such that the matrix $\begin{matrix}\begin{pmatrix}{\int_{R^{n}}\left( {w_{1} \circ {g(y)}} \right)} & {\int_{R^{n}}{y_{1}\left( {w_{1} \circ {g(y)}} \right)}} & \ldots & {\int_{R^{n}}{y_{n}\left( {w_{1} \circ {g(y)}} \right)}} \\\vdots & ⋰ & \vdots & \quad \\{\int_{R^{n}}\left( {w_{n + 1} \circ {g(y)}} \right)} & {\int_{R^{n}}{y_{1}\left( {w_{n + 1} \circ {g(y)}} \right)}} & \ldots & {\int_{R^{n}}{y_{n}\left( {w_{n + 1} \circ {g(y)}} \right)}}\end{pmatrix} & (22)\end{matrix}$is full rank.

Remark: As in the previous case, the elements of the matrix (22) dependonly on the template and its coordinate system and thus have to beevaluated only once. Therefore, the denominator of (4) together with thematrix (22) represent all the information in the template, required forfinding the affine transformation parameters including the translation.Hence, the denominator of (4) together with the matrix (22) form asufficient representation of the template.

Remark: We note that in case there exists a translation, i.e., c≠0, butthe translation is considered a nuisance parameter, and hence there isno need to estimate the translation of the observed object relative tothe template, the {w_(i)} can be chosen such that∫_(R^(n))w_(i) ∘ g(x) ≡ 0,which also implies ∫_(R^(n))w_(i) ∘ h(x) ≡ 0.This results in invariance of the solution for A to translations. Inthis case only n equations are required to estimate the affinetransformation, exactly as obtained in Section 2 assuming thetranslations of both the template and the observation relative to theorigin of the coordinate system are zero.

Remark: As in Section 3, if the model is only an approximate one due tomodel mismatch, (4) and the system (21) are replaced by theircorresponding least squares versions obtained by employing a set ofcompositions {w_(l)}_(l=1) ^(p):R→R with p≧n+1.

4.1. Discussion

As indicated earlier, for the case where the affine deformation can beassumed small and the observed functions are differentiable, a widelyused approach is to linearize the problem using a first order Taylorseries expansion.

Because superficially both the solution proposed in this paper and theapproximate linearized solution posses similar structures, as they bothend-up solving a linear system of equations for the affinetransformation parameters, we briefly explain in this subsection theapproximate solution and discuss its disadvantages relative to themethod proposed here.

Suppose the deformation is small enough so that a first order Taylorseries expansion approximates the deformation function well enough,i.e., h(x,y)≈g(x+(ax+by+c),y+(dx+ey+ƒ)), where (ax+by+c) and (dx+ey+ƒ)are small enough such that for g itself $\begin{matrix}\begin{matrix}{\begin{matrix}{g\left( {{x + \left( {{ax} + {by} + c} \right)},} \right.} \\\left. {y + \left( {{dx} + {ey} + f} \right)} \right)\end{matrix} \cong {{g\left( {x,y} \right)} + {\left( {{ax} + {by} + c} \right)\frac{\partial g}{\partial x}\left( {x,y} \right)} +}} \\{\left( {{dx} + {ey} + f} \right)\frac{\partial g}{\partial y}\left( {x,y} \right)}\end{matrix} & (23)\end{matrix}$One may next define a cost function between the observation and theapproximating deformed template: $\begin{matrix}\begin{matrix}{{V\left( {a,b,c,d,e,f} \right)} = {\sum\limits_{({x,y})}\quad\left\lbrack {{h\left( {x,y} \right)} - {g\left( {{x + \left( {{ax} + {by} + c} \right)},} \right.}} \right.}} \\{\left. \left. {y + \left( {{dx} + {ey} + f} \right)} \right) \right\rbrack^{2}.}\end{matrix} & (24)\end{matrix}$Substitution of (23) into (24) yields $\begin{matrix}{{V\left( {a,b,c,d,e,f} \right)} = {\sum\limits_{({x,y})}\quad\left\lbrack {{h\left( {x,y} \right)} - {g\left( {x,y} \right)} - {\left( {{ax} + {by} + c} \right)\frac{\partial g}{\partial x}\left( {x,y} \right)} -} \right.}} \\\left. {\left( {{dx} + {ey} + f} \right)\frac{\partial g}{\partial y}\left( {x,y} \right)} \right\rbrack^{2}\end{matrix}$Let${I_{x} = {\frac{\partial g}{\partial x}\left( {x,y} \right)}},{I_{y} = {\frac{\partial g}{\partial y}\left( {x,y} \right)}},{I_{d} = {{g\left( {x,y} \right)} - {{h\left( {x,y} \right)}.}}}$Minimizing V(a, b, c, d, e, f) by taking its partial derivatives withrespect to the six transformation parameters and equating each result tozero, one obtains $\begin{matrix}{\frac{\partial V}{\partial a} = {{\sum{2\left( {{h\left( {x,y} \right)} - {g\left( {x,y} \right)} - {\left( {{ax} + {by} + c} \right)\frac{\partial g}{\partial x}\left( {x,y} \right)} - {\left( {{dx} + {ey} + f} \right)\frac{\partial g}{\partial y}\left( {x,y} \right)}} \right)x\frac{\partial g}{\partial x}\left( {x,y} \right)}} = 0}} & (25)\end{matrix}$and similarly for the derivatives with respect to the remainingparameters. Reorganization of the six equations of the form (25) resultsin the following system of linear constraints: $\begin{matrix}{\begin{pmatrix}{\sum{{xI}_{x}x_{x}}} & {\sum{{yI}_{x}{xI}_{x}}} & {\sum{I_{x}{xI}_{x}}} & {\sum{{xI}_{y}{xI}_{x}}} & {\sum{{yI}_{y}{xI}_{x}}} & {\sum{I_{y}{xI}_{x}}} \\{\sum{{xI}_{x}{yI}_{x}}} & {\sum{{yI}_{x}{yI}_{x}}} & {\sum{I_{x}{yI}_{x}}} & {\sum{{xI}_{y}{yI}_{x}}} & {\sum{{yI}_{y}{yI}_{x}}} & {\sum{I_{y}{yI}_{x}}} \\{\sum{{xI}_{x}I_{x}}} & {\sum{{yI}_{x}I_{x}}} & {\sum{I_{x}I_{x}}} & {\sum{{xI}_{y}I_{x}}} & {\sum{{yI}_{y}I_{x}}} & {\sum{I_{y}I_{x}}} \\{\sum{{xI}_{x}{xI}_{y}}} & {\sum{{yI}_{x}{xI}_{y}}} & {\sum{I_{x}{xI}_{y}}} & {\sum{{xI}_{y}{xI}_{y}}} & {\sum{{yI}_{y}{xI}_{y}}} & {\sum{I_{y}{xI}_{y}}} \\{\sum{{xI}_{x}{yI}_{y}}} & {\sum{{yI}_{x}{yI}_{y}}} & {\sum{I_{x}{yI}_{y}}} & {\sum{{xI}_{y}{yI}_{y}}} & {\sum{{yI}_{y}{yI}_{y}}} & {\sum{I_{y}{yI}_{y}}} \\{\sum{{xI}_{x}I_{y}}} & {\sum{{yI}_{x}I_{y}}} & {\sum{I_{x}I_{y}}} & {\sum{{xI}_{y}I_{y}}} & {\sum{{yI}_{y}I_{y}}} & {\sum{I_{y}I_{y}}}\end{pmatrix}{\quad{\begin{pmatrix}a \\b \\c \\d \\e \\f\end{pmatrix} = \begin{pmatrix}{\sum{I_{d}{xI}_{x}}} \\{\sum{I_{d}{yI}_{x}}} \\{\sum{I_{d}I_{x}}} \\{\sum{I_{d}{xI}_{y}}} \\{\sum{I_{d}{yI}_{y}}} \\{\sum{I_{d}I_{y}}}\end{pmatrix}}}} & (26)\end{matrix}$

Provided that the matrix is invertible, a solution for the affinetransformation parameters is obtained.

Thus in case the deformation is indeed small, an approximate linearsolution to the problem of estimating the affine transformationparameters is formulated. On the other hand, the linear solutionproposed in this paper is unique, exact and is applicable to any affinetransformation regardless of the magnitude of the deformation. Moreover,while in the presence of noise the new method suggested in this paperallows for an increase in the number of equations which leads to a LSsolution for the affine transformation parameters, the linearizedapproximate solution is based on only six equations, and no additionallinear constraints on the parameters can be added to the system. Alsonote that the linearized approximation requires the evaluation of theimage spatial derivatives. This evaluation is highly problematic sincethe observed objects are usually not continuous nor differentiableeverywhere. This restriction becomes even more problematic in thepresence of noise. On the other hand, the method derived in this paperemploys only integration of the observed data and hence is much morerobust.

5. Finding the Affine Transformation Subject to a Spatially ConstantIlumination Gain

In the analysis carried out so far it has been assumed that there is noillumination variation between the template and the observation, andhence the observed deformation is only due to the geometric distortionof the coordinate system caused by the affine transformation. In thissection we generalize the proposed solution and address the more generaldeformation model where the model given by (1) is replaced byh(x)=ag(Ax+c),AεGL _(n)(R), x,cεR ^(n) ,aεR,a>0  (27)where a, A and c are unknown and need to be determined. As we prove inthis section, the problem of finding the constant illumination gainamounts to replacing the step in which the Jacobian of thetransformation is being determined in the case where there is noillumination change between the observation and the template, by a stepin which both the illumination gain and the Jacobian are jointlydetermined. More specifically, since h(x)=ag(Ax+c), we have$\begin{matrix}{{\int_{R^{n}}^{\quad}{h^{2}(x)}} = {{a^{2}{\int_{R^{n}}^{\quad}{{A^{- 1}}{g^{2}(y)}}}} = {a^{2}{A^{- 1}}{\int_{R^{n}}^{\quad}{g^{2}(y)}}}}} & (28)\end{matrix}$Similarly, $\begin{matrix}{{{\int_{R^{n}}^{\quad}{h^{4}(x)}} = {a^{4}{A^{- 1}}{\int_{R^{n}}^{\quad}{g^{4}(y)}}}}{{Hence},}} & (29) \\{{{{A^{- 1}}a^{2}} = \frac{\int_{R^{n}}^{\quad}{h^{2}(x)}}{\int_{R^{n}}^{\quad}{g^{2}(y)}}}{and}} & (30) \\{{{A^{- 1}}a^{4}} = \frac{\int_{R^{n}}^{\quad}{h^{4}(x)}}{\int_{R^{n}}^{\quad}{g^{4}(y)}}} & (31)\end{matrix}$

Thus, both the Jacobian, |A⁻¹|, and the illumination gain, a, can beevaluated using (30)-(31). In the second stage n linear and independentconstraints on the type (7) (or n+1 linear and independent constraintsof the type (20), if translation is to be estimated) on the elements ofthe inverse transformation matrix must be set. For generality, weconsider the case where translation is estimated as well. Applying afamily of Lebesgue measurable, left-hand compositions {w_(l)}:R→R to theknown relation 1/ah(x)=g(Ax+c) and integrating over both sides of theequality, similarly to (20) we obtain $\begin{matrix}{{\int_{R^{n}}^{\quad}{x_{k}{w_{l} \circ \left( \frac{h(x)}{a} \right)}}} = {{{A^{- 1}}{\int_{R^{n}}^{\quad}{\left( {T_{k}\overset{\sim}{y}} \right){w_{l} \circ {g\left( \overset{\sim}{y} \right)}}}}} - {{A^{- 1}}{\sum\limits_{i = 0}^{n}\quad{q_{ki}{\int_{R^{n}}^{\quad}{{\overset{\sim}{y}}_{i}w_{l}{{circg}\left( \overset{\sim}{y} \right)}}}}}}}} & (32)\end{matrix}$and in a matrix form, since g(y)=g({tilde over (y)}), $\begin{matrix}{{\begin{pmatrix}{\int_{R^{n}}^{\quad}\left( {w_{1} \circ {g(y)}} \right)} & {\int_{R^{n}}^{\quad}{y_{1}\left( {w_{1} \circ {g(y)}} \right)}} & \cdots & {\int_{R^{n}}^{\quad}{y_{n}\left( {w_{1} \circ {g(y)}} \right)}} \\\vdots & ⋰ & \vdots & \quad \\{\int_{R^{n}}^{\quad}\left( {w_{n + 1} \circ {g(y)}} \right)} & {\int_{R^{n}}^{\quad}{y_{1}\left( {w_{n + 1} \circ {g(y)}} \right)}} & \cdots & {\int_{R^{n}}^{\quad}{y_{n}\left( {w_{n + 1} \circ {g(y)}} \right)}}\end{pmatrix}\begin{pmatrix}q_{k0} \\q_{k1} \\\vdots \\q_{kn}\end{pmatrix}} = \begin{pmatrix}{{A}{\int_{R^{n}}^{\quad}{x_{k}\left( {w_{1} \circ \frac{h(x)}{a}} \right)}}} \\\vdots \\{{A}{\int_{R^{n}}^{\quad}{x_{k}\left( {w_{n + 1} \circ \frac{h(x)}{a}} \right)}}}\end{pmatrix}} & (33)\end{matrix}$

Thus, once the illumination gain and the Jacobian have been evaluated,the R.H.S. of (33) is known, and the solution for the elements of the kth row of T is obtained. Hence we have,

Theorem 3. Let AεGL_(n)(R). Assume h,gεM_(Aƒƒ)(R^(n),R) such thath(x)=ag(Ax+c), and a is an unknown real gain coefficient. Givenmeasurements of h and g, then A, c and a can be uniquely determined ifthere exists a set of Lebesgue measurable functions {w_(l)}_(l=1) ^(n+1)such that the matrix (22) is full rank.

Remark: Similarly to the previous cases, where illumination is assumedfixed, the denominators of (30) and (31), as well as the elements of thematrix (22) depend only on the template and its coordinate system andthus have to be evaluated only once. Therefore, the denominators of(30), (31) together with the matrix (22) represent all the informationin the template, required for finding the affine transformationparameters including the translation, in the case where the illuminationgain is unknown. Hence, the denominators of (30) and (31) together withthe matrix (22) form a sufficient representation of the template.

Remark: As in Section 5, if the model is only an approximate one due tomodel mismatch, the system (30)-(3 1) as well as the system (33) arereplaced by their corresponding least squares versions obtained byemploying a set of compositions {w_(l)}_(l=1) ^(p):R→R with p≧n+1.

6. The Algorithmic Solution for a Time Dependent Evolution

In this section we extend the derivation in the previous sections to thecase where the affine transformation changes with time, while the giventemplate g(·) is fixed in time. The set of observations in this case isthe time sequence h(x,t). Thus, the problem statement is as follows: LetA(t)εGL_(n)(R) for every t. Assume that gεM_(Aƒƒ)(R^(n),R) and that forevery t, h(t)εM_(Aƒƒ)(R^(n),R) such thath(x,t)=g(A(t)x)  (34)Thus, given h(t) and g, find A(t).Let {e_(i)(t):R→R} be a set of linearly independent functions in L₂(R)(e.g., polynomials, trigonometric functions). Assume that the timedependence of A⁻¹ is parameterized as $\begin{matrix}{{A^{- 1}(t)} = {\begin{pmatrix}{\sum\limits_{i = 1}^{L}\quad{a_{11}^{i}{e_{i}(t)}}} & \cdots & {\sum\limits_{i = 1}^{L}\quad{a_{1n}^{i}{e_{i}(t)}}} \\\vdots & ⋰ & \vdots \\{\sum\limits_{i = 1}^{L}\quad{a_{n1}^{i}{e_{i}(t)}}} & \cdots & {\sum\limits_{i = 1}^{L}\quad{a_{nn}^{i}{e_{i}(t)}}}\end{pmatrix} = {{\sum\limits_{i = 1}^{L}{\begin{pmatrix}a_{11}^{i} & \cdots & a_{i1n}^{i} \\\vdots & ⋰ & \vdots \\a_{n1}^{i} & \cdots & a_{nn}^{i}\end{pmatrix}{e_{i}(t)}\quad t}} \in R}}} & (35)\end{matrix}$The problem then is to determine the above sequence of L matrices.Define the following n-dimensional product e_(i) ₁ (t)e_(i) ₂ (t) . . .e_(i) _(n) (t) where the indices i₁, i₂, . . . , i_(n)ε{1, . . . , L}and let{e_(i) ₁ (t)e_(i) ₂ (t) . . . e_(i) _(n) (t)}_(i) ₁ _(, i) ₂_(, . . . , i) _(n) ₌₁ ^(L)be the set of all such products. Clearly there are at most LU differentelements in this set. Let Q be the number of linearly independentelement in this set. To simplify the notation we shall refer to this setusing the notation {ƒ_(k)(t)}_(k=1) ^(Q). By definition, the determinantof A⁻¹(t) in (35) has the form $\begin{matrix}{{{A^{- 1}}(t)} = {\sum\limits_{k = 1}^{Q}{\mathbb{d}_{k}{f_{k}(t)}}}} & (36)\end{matrix}$Extending the previous derivations to the time varying case we have$\begin{matrix}{{\int_{R^{n}}^{\quad}{w_{p} \circ {h\left( {x,t} \right)}}} = {{\int_{R^{n}}^{\quad}{w_{p} \circ {g\left( {{A(t)}x} \right)}}} = {{A^{- 1}}(t){\int_{R^{n}}^{\quad}{w_{p} \circ {g(y)}}}}}} & (37)\end{matrix}$Substituting (36) into (37) we have $\begin{matrix}{{\int_{R^{n}}^{\quad}{{w_{p} \circ h}\left( {x,t} \right)}} = {\sum\limits_{k = 1}^{Q}{{\mathbb{d}_{k}{f_{k}(t)}}{\int_{R^{n}}^{\quad}{w_{p} \circ {g(y)}}}}}} & (38)\end{matrix}$Let [t₁, . . . , t_(v)]^(T) denote the vector of time samples and let{w_(i}) _(i=1) ^(p) be the sequence of chosen left-compositions. Then,$\begin{matrix}{\begin{pmatrix}{\int_{R^{n}}{w_{1} \circ {h\left( {x,t_{1}} \right)}}} \\\vdots \\{\int_{R^{n}}{w_{1} \circ {h\left( {x,t_{v}} \right)}}} \\{\int_{R^{n}}{w_{2} \circ {h\left( {x,t_{1}} \right)}}} \\\vdots \\{\int_{R^{n}}{w_{2} \circ {h\left( {x,t_{v}} \right)}}} \\\vdots \\{\int_{R^{n}}{w_{P} \circ {h\left( {x,t_{v}} \right)}}}\end{pmatrix} = {\begin{pmatrix}{{f_{1}\left( t_{1} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} & {{f_{2}\left( t_{1} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} & \cdots & {{f_{Q}\left( t_{1} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} \\\vdots & \quad & \quad & \vdots \\{{f_{1}\left( t_{v} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} & {{f_{2}\left( t_{v} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} & \cdots & {{f_{Q}\left( t_{v} \right)}{\int_{R^{n}}{w_{1} \circ {g(y)}}}} \\{{f_{1}\left( t_{1} \right)}{\int_{R^{n}}{w_{2} \circ {g(y)}}}} & \quad & \quad & {{f_{Q}\left( t_{1} \right)}{\int_{R^{n}}{w_{2} \circ {g(y)}}}} \\\vdots & \cdots & \quad & \vdots \\{{f_{1}\left( t_{v} \right)}{\int_{R^{n}}{w_{2} \circ {g(y)}}}} & {{f_{2}\left( t_{v} \right)}{\int_{R^{n}}{w_{2} \circ {g(y)}}}} & \cdots & {{f_{Q}\left( t_{v} \right)}{\int_{R^{n}}{w_{2} \circ {g(y)}}}} \\\vdots & \quad & \quad & \vdots \\{{f_{1}\left( t_{v} \right)}{\int_{R^{n}}{w_{p} \circ {g(y)}}}} & {{f_{2}\left( t_{v} \right)}{\int_{R^{n}}{w_{p} \circ {g(y)}}}} & \cdots & {{f_{Q}\left( t_{v} \right)}{\int_{R^{n}}{w_{p} \circ {g(y)}}}}\end{pmatrix}\begin{pmatrix}d_{1} \\d_{2} \\\vdots \\d_{Q}\end{pmatrix}}} & (39)\end{matrix}$By choosing the {w_(i}) _(i=1) ^(p) such that at least Q rows of thismatrix are linearly independent|A|(t) is found by a LS solution.Clearly, if the model is accurate only P=Q equations are required.

Having obtained the Jacobian of the transformation we next determine theentries of A⁻¹(t) for each of its rows and for all tε{t₁, . . . ,t_(v)}. $\begin{matrix}\begin{matrix}{{\int_{R^{n}}{x_{k}{w_{p} \circ {h\left( {x,t} \right)}}}} = {\int_{R^{n}}{x_{k}{w_{p} \circ {g\left( {{A(t)}x} \right)}}}}} \\{= {{{A^{- 1}(t)}}{\int_{R^{n}}\left( \left( {\left( {{A^{- 1}(t)}_{k}y} \right){w_{p} \circ {g(y)}}} \right) \right.}}} \\{= {{{A^{- 1}(t)}}{\int_{R^{n}}\left( {\left( {\sum\limits_{j = 1}^{n}{\left( {\sum\limits_{l = 1}^{L}{a_{kj}^{i}{e_{i}(t)}}} \right)y_{j}}} \right){w_{p} \circ {g(y)}}} \right)}}} \\{= {\sum\limits_{j = 1}^{n}\left( {{{A^{- 1}(t)}}\left( {\sum\limits_{i = 1}^{L}{a_{kj}^{i}{e_{i}(t)}}} \right){\int_{R^{n}}{y_{j}{w_{p} \circ {g(y)}}}}} \right)}} \\{= {\sum\limits_{i = 1}^{L}{\sum\limits_{j = 1}^{n}{a_{kj}^{i}\left( {{e_{i}(t)}{{A^{- 1}(t)}}{\int_{R^{n}}{y_{j}{w_{p} \circ {g(y)}}}}} \right)}}}}\end{matrix} & (40)\end{matrix}$which yields when expressed in a matrix form $\begin{matrix}{\begin{pmatrix}{\int_{R^{n}}{x_{k}{w_{1} \circ {h\left( {x,t_{1}} \right)}}}} \\{\int_{R^{n}}{x_{k}{w_{1} \circ {h\left( {x,t_{2}} \right)}}}} \\\vdots \\{\int_{R^{n}}{x_{k}{w_{1} \circ {h\left( {x,t_{v}} \right)}}}} \\{\int_{R^{n}}{x_{k}{w_{2} \circ {h\left( {x,t_{1}} \right)}}}} \\\vdots \\{\int_{R^{n}}{x_{k}{w_{p} \circ {h\left( {x,t_{v}} \right)}}}}\end{pmatrix} = {\left\lbrack {G_{t}^{1},G_{t}^{2},\ldots\quad,G_{t}^{L}} \right\rbrack\begin{pmatrix}a_{k1}^{1} \\a_{k2}^{1} \\\vdots \\a_{kn}^{1} \\a_{kn}^{2} \\\vdots \\a_{kn}^{L}\end{pmatrix}}} & (41) \\{G_{t}^{i} = \begin{pmatrix}{{e_{i}\left( t_{1} \right)}{A^{- 1}}\left( t_{1} \right)\left( {\int_{R^{n}}{y_{1}{w_{1} \circ {g(y)}}}} \right)} & \cdots & {{e_{i}\left( t_{1} \right)}{A^{- 1}}\left( t_{1} \right)\left( {\int_{R^{n}}{y_{n}{w_{1} \circ {g(y)}}}} \right)} \\{{e_{i}\left( t_{2} \right)}{A^{- 1}}\left( t_{2} \right)\left( {\int_{R^{n}}{y_{1}{w_{1} \circ {g(y)}}}} \right)} & \cdots & {{e_{i}\left( t_{2} \right)}{A^{- 1}}\left( t_{2} \right)\left( {\int_{R^{n}}{y_{n}{w_{1} \circ {g(y)}}}} \right)} \\\vdots & ⋰ & \vdots \\{{e_{i}\left( t_{v} \right)}{A^{- 1}}\left( t_{v} \right)\left( {\int_{R^{n}}{y_{1}{w_{p} \circ {g(y)}}}} \right)} & \cdots & {{e_{i}\left( t_{v} \right)}{A^{- 1}}\left( t_{v} \right)\left( {\int_{R^{n}}{y_{n}{w_{p} \circ {g(y)}}}} \right)}\end{pmatrix}} & (42)\end{matrix}$Hence,

Theorem 4. Let A(t)εGL_(n)(R) for every t. Assume that for every t,h(t),gεM_(Aƒƒ)(R^(n),R) such that h(x)=g(A(t)x). Then given measurementsof h(t) and g, A(t) can be uniquely determined if there exists a set ofmeasurable functions {w_(l)}_(l=1) ^(p) such that the matrix[G_(t) ¹,G_(t) ², . . . , G_(t) ^(L)]  (43)is full rank.

7. Observations Subject to Additive Noise: The Least Squares Solution

In the presence of noise the observed data is given byh(x)=g(Ax)+η(x).  (44)Assume that the noise is stationary, zero mean and uncorrelated suchthat its higher order moments are known, where we adopt the notationσk=E[η^(k)(x)]. We first address questions related to the issue of theoptimal choice of the set {w_(p)} for each template function g. We beginby adapting the solution derived in the previous sections for thedeterministic case, to a least squares solution for the modelparameters. In the presence of noise the basic equation (7) becomes$\begin{matrix}{{\int_{R^{n}}{x_{k}{w_{p} \circ {h(x)}}}} = {\int_{R^{n}}{x_{k}{w_{p} \circ \left\lbrack {{g\left( {A\quad x} \right)} + {\eta(x)}} \right\rbrack}}}} & (45)\end{matrix}$Let ε_(p,k) ^(g) denote the error term in each equation due to thepresence of the noise, i.e., $\begin{matrix}{ɛ_{p,k}^{g} = {\int_{R^{n}}{x_{k}\left\{ {{w_{p} \circ \left\lbrack {{g\left( {A\quad x} \right)} + {\eta(x)}} \right\rbrack} - {w_{p} \circ {g\left( {A\quad x} \right)}}} \right\}}}} & (46)\end{matrix}$Following a similar procedure to the one developed in the previoussections, we obtain the linear system of equations $\begin{matrix}\begin{matrix}{{\int_{R^{n}}{x_{k}{w_{p} \circ h}(x)}} = {{{A^{- 1}}{\int_{R^{n}}{\left( {\left( A^{- 1} \right)_{k}y} \right){w_{p} \circ {g(y)}}}}} + ɛ_{p,k}^{g}}} \\{= {{{A^{- 1}}{\sum\limits_{i = 1}^{n}{q_{ki}{\int_{R^{n}}{y_{i}{w_{p} \circ g}(y)}}}}} +}} \\{{{ɛ_{p,k}^{g}p} = 1},\ldots\quad,P}\end{matrix} & (47)\end{matrix}$The system (47) represents a linear regression problem where the noisesequence {ε_(p,k) ^(g)} is, in general, non-stationary since itsstatistics depend on the choice of w_(p) for each p. The regressors arefunctions of w_(p) and the template g, and hence are deterministic.Provided that the sequence of composition functions {w_(p)}_(p=1) ^(p)is chosen such that the resulting regressors matrix is full rank, thesystem (47) is solved by a linear least squares method such that the l₂norm of the noise vector is minimized, exactly as in (17), while |A⁻¹|is evaluated using (16).

The dependence of the noise sequence {ε_(p,k) ^(g)} on the choice ofw_(p) suggests that different choices of the composition sequence{w_(p)}_(p=1) ^(P) may provide different solutions. We shall be firstinterested in systems such that for each p, the linear constraintimposed by w_(p) is such that the “effective noise” that corresponds toeach w_(p) is zero mean.

7.1. Construction of Linear Constraints with Zero-Mean Error Terms

Consider the case where we choose w_(p)(x)=Σ_(k)α_(k) ^(p)x^(k). We nextevaluate the mean term of the Jacobian and the mean term of each of thelinear constraints, (47), on the entries of A⁻¹. Thus, correction termscan be introduced such that the non-zero-mean error terms are replacedby zero mean error terms. To simplify the notation we will takeadvantage of the linear structure of w_(p)(x), and analyze only the casewhere w_(p)(x)=x^(p) and the generalization is straightforward. Thus, inthis case $\begin{matrix}\begin{matrix}{{\int_{R^{n}}{h^{p}(x)}} = {\int_{R^{n}}\left( {{g\left( {A\quad x} \right)} + {\eta(x)}} \right)^{p}}} \\{= {\int_{R^{n}}{\sum\limits_{i = 0}^{p}{\begin{pmatrix}{p(84)} \\{i(85)}\end{pmatrix}{g^{i}\left( {A\quad x} \right)}\left( {\eta(x)} \right)^{p - i}}}}} \\{= {{A^{- 1}}{\sum\limits_{i = 0}^{p}{\begin{pmatrix}{p(86)} \\{i(87)}\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\left( {\eta\left( {A^{- 1}(y)} \right)} \right)^{p - i}}}}}}}\end{matrix} & (48)\end{matrix}$Evaluating the expected values of both sides of (88) we have$\begin{matrix}\begin{matrix}{{E\left\lbrack {\int_{R^{n}}{h^{p}(x)}} \right\rbrack} = {E\left\lbrack {{A^{- 1}}{\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}\left( {{g^{i}(y)}\left( {\eta\left( {A^{- 1}(y)} \right)} \right)^{p - i}} \right)}}}} \right\rbrack}} \\{= {{A^{- 1}}{\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\sigma_{p - i}}}}}}} \\{= {{{A^{- 1}}{\int_{R^{n}}{g^{p}(y)}}} + {{A^{- 1}}{\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\sigma_{p - i}}}}}}}}\end{matrix} & (49)\end{matrix}$where the first term in the sum is precisely the term obtained in theabsence of noise and with w_(p)(x)=x^(p), while${A^{- 1}}{\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\sigma_{p - i}}}}}$is an added mean term due to the noise contribution.Let ε_(p) ^(J,g) denote the error term in the Jacobian constraintequation, due to the presence of the noise for the choicew_(p)(x)=x^(p), i.e., $\begin{matrix}\begin{matrix}{ɛ_{p}^{J,g} = {\int_{R^{n}}\left\{ {\left\lbrack {{g({Ax})} + {\eta(x)}} \right\rbrack^{p} - {g^{p}({Ax})}} \right\}}} \\{= {{{A^{- 1}}{\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\sigma_{p - i}}}}}} + {\overset{\sim}{ɛ}}_{p}^{J,g}}}\end{matrix} & (50)\end{matrix}$where {tilde over (ε)}_(p) ^(J,g) is a zero mean random variable. We cantherefore rewrite (48) as $\begin{matrix}{{\int_{R^{n}}{h^{p}(x)}} = {{{A^{- 1}}\left( {{\int_{R^{n}}{g^{p}(y)}} + {\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{{g^{i}(y)}\sigma_{p - i}}}}}} \right)} + {\overset{\sim}{ɛ}}_{p}^{J,g}}} & (51)\end{matrix}$The system (101) represents a linear regression problem in |A⁻¹| wherethe observation noise is zero mean. The regressors are functions ofw_(p), the template g, and the known statistics of the noise. Hence theregressors are deterministic. The system (101) is solved by a linearleast squares method such that Σ_(p=1) ^(P)|{tilde over (ε)}_(p)^(J,g)|² is minimized.

Following a similar procedure we next obtain zero-mean linearconstraints on the entries of A⁻¹: $\begin{matrix}\begin{matrix}{{\int_{R^{n}}{x_{k}{h^{p}(x)}}} = {\int_{R^{n}}{x_{k}\left( {{g({Ax})} + {\eta(x)}} \right)}^{p}}} \\{= {\int_{R^{n}}{x_{k}{\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{g^{i}({Ax})}\left( {\eta(x)} \right)^{p - i}}}}}} \\{= {\int_{R^{n}}{{A^{- 1}}\left( {\sum\limits_{j = 1}^{n}{q_{kj}y_{j}}} \right){\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{g^{i}(y)}\left( {\eta\left( {A^{- 1}(y)} \right)} \right)^{p - i}}}}}} \\{= {{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}\left( {\int_{R^{n}}{y_{j}\left( {{g^{i}(y)}\left( {\eta\left( {A^{- 1}(y)} \right)} \right)^{p - i}} \right)}} \right)}} \right)}}}}\end{matrix} & (52)\end{matrix}$Evaluating the expected values of both sides of (52) we have$\begin{matrix}\begin{matrix}{{E\left\lbrack {\int_{R^{n}}{x_{k}{h^{p}(x)}}} \right\rbrack} = {E\left\lbrack {{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}\left( {\int_{R^{n}}{y_{j}\left( {{g^{i}(y)}\left( {\eta\left( {A^{- 1}(y)} \right)} \right)^{p - i}} \right)}} \right)}} \right)}}} \right\rbrack}} \\{= {{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{y_{j}{g^{i}(y)}\sigma_{p - i}}}}} \right)}}}} \\{= {{{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}{\int_{R^{n}}{y_{j}{g^{p}(y)}}}}}} + {{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{y_{j}{g^{i}(y)}\sigma_{p - i}}}}} \right)}}}}}\end{matrix} & (53)\end{matrix}$where the first term in the sum is precisely the term on the R.H.S. of(7) obtained in the absence of noise and with w_(p)(x)=x^(p), while${A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{y_{j}{g^{i}(y)}\sigma_{p - i}}}}} \right)}}$is an added mean term due to the noise contribution.

Hence for the choice of w_(p)(x)=x^(p), ε_(p,k) ^(g) defined in (46) hasthe form $\begin{matrix}\begin{matrix}{ɛ_{p,k}^{g} = {\int_{R^{n}}{x_{k}\left\{ {\left\lbrack {{g({Ax})} + {\eta(x)}} \right\rbrack^{p} - {g^{p}({Ax})}} \right\}}}} \\{= {{{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left( {\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{y_{j}{g^{i}(y)}\sigma_{p - i}}}}} \right)}}} + {\overset{\sim}{ɛ}}_{p,k}^{g}}}\end{matrix} & (54)\end{matrix}$where the last equality is due to (7) and (52), while {tilde over(ε)}_(p,k) ^(g) is a zero mean random variable. Thus, (52) can bewritten in the form $\begin{matrix}{{\int_{R^{n}}{x_{k}{h^{p}(x)}}} = {{{A^{- 1}}{\sum\limits_{j = 1}^{n}{q_{kj}\left\{ {{\int_{R^{n}}{y_{j}{g^{p}(y)}}} + {\sum\limits_{i = 0}^{p - 1}{\begin{pmatrix}p \\i\end{pmatrix}{\int_{R^{n}}{y_{j}{g^{i}(y)}\sigma_{p - i}}}}}} \right\}}}} + {\overset{\sim}{ɛ}}_{p,k}^{g}}} & (55)\end{matrix}$Thus the system (55) represents a linear regression problem, differentfrom (47), where the observation noise is non-stationary, but with azero mean. The regressors are functions of w_(p), the template g, andthe known statistics of the noise. Hence the regressors aredeterministic. Provided that the resulting regressors matrix is fullrank, the system (55) is solved by a linear least squares method suchthat Σ_(p=1) ^(P)|{tilde over (ε)}_(p,k) ^(g)|² is minimized.

It should be noted however that the noise sequence {tilde over(ε)}_(p,k) ^(g) is only approximately zero-mean, since in (53) it isassumed that |A⁻¹| is a deterministic constant. However, in theestimation procedure itself it is replaced by its estimate obtained fromsolving (51), which is a function of the noise sequence q(x).

8. Analysis of the High SNR Case

In this section we analyze the proposed method assuming that the signalto noise ratio is high. Let σ² denote the noise variance. In thefollowing we restrict our attention to continuous and differentiable (atleast twice) composition functions w_(p)(x):R→R. Expanding w_(p)∘h(x)into a Taylor series about g(Ax), and employing the high SNR assumptionto neglect the contribution of high noise powers we havew _(p) ∘h(x)≈w _(p) ∘g(Ax)+η(x)w′ _(p) ∘g(Ax)  (56)where${w_{p}^{\prime}(x)} = {\frac{\mathbb{d}{w_{p}(x)}}{\mathbb{d}x}.}$Hence, the error term in the “moment” equation, (46), is approximatedunder the high SNR assumption by $\begin{matrix}{{{\overset{\sim}{ɛ}}_{p,k}^{g} = {{\int_{R^{n}}{x_{k}{\eta(x)}{w_{p}^{\prime} \circ {g\left( {A\quad x} \right)}}p}} = 1}},\ldots\quad,P} & (57)\end{matrix}$Clearly, E[{overscore (ε)}_(p) ^(g)]=0, p=1, . . . , P. We next evaluatethe error covariances of the system, under the high SNR assumption. Letε_(k) ^(g)=[{overscore (ε)}_(l,k) ^(g), . . . , {overscore (ε)}_(P,k)^(g)]^(T), ε^(g)=[(ε₁ ^(g))^(T), . . . , (ε_(n) ^(g)) ^(T)]^(T) andΓ=E[ε^(g)(ε^(g))^(T)]. Thus, the (r,s) element of the P×P dimensional,(k,l) block of Γ, denoted Γ^(k,l) is given by $\begin{matrix}\begin{matrix}{{\Gamma^{k,l}\left( {r,s} \right)} = {E\left\lbrack {\int_{R^{n}}{x_{k}{\eta(x)}{w_{r}^{\prime} \circ {g({Ax})}}{\int_{R^{n}}{z_{1}{\eta(z)}{w_{s}^{\prime} \circ {g({Az})}}}}}} \right\rbrack}} \\{= {\sigma^{2}{\int_{R^{n}}{x_{k}x_{l}{w_{r}^{\prime} \circ {g({Ax})}}{w_{s}^{\prime} \circ {g({Ax})}}}}}} \\{= {\sigma^{2}{A^{- 1}}{\int_{R^{n}}{{\left\lbrack {\left( A^{- 1} \right)_{k}y} \right\rbrack\left\lbrack {\left( A^{- 1} \right)_{l}y} \right\rbrack}{w_{r}^{\prime} \circ {g(y)}}{w_{s}^{\prime} \circ {g(y)}}}}}} \\{= {\sigma^{2}{A^{- 1}}{\sum\limits_{i = 1}^{n}{\sum\limits_{j = 1}^{n}{q_{ki}q_{lj}{\int_{R^{n}}{y_{i}y_{j}{w_{r}^{\prime} \circ {g(y)}}{w_{s}^{\prime} \circ {g(y)}}}}}}}}}\end{matrix} & (58)\end{matrix}$Rewriting (58) in matrix form we have $\begin{matrix}{\Gamma^{k,l} = {\sigma^{2}{A^{- 1}}{\sum\limits_{i = 1}^{n}{\sum\limits_{j = 1}^{n}{q_{ki}q_{lj}u_{i,j}\quad{where}}}}}} & (59) \\{u_{i,j} = \begin{pmatrix}{\int_{R^{n}}{\int_{R^{n}}{y_{i}y_{j}{w_{1}^{\prime} \circ {g(y)}}{w_{1}^{\prime} \circ {g(y)}}}}} & \cdots & {\int_{R^{n}}{y_{i}y_{j}{w_{1}^{\prime} \circ {g(y)}}{w_{p}^{\prime} \circ {g(y)}}}} \\\vdots & {\quad ⋰} & \vdots \\{\int_{R^{n}}{y_{i}y_{j}{w_{p}^{\prime} \circ {g(y)}}{w_{1}^{\prime} \circ {g(y)}}}} & \cdots & {\int_{R^{n}}{y_{i}y_{j}{w_{p}^{\prime} \circ {g(y)}}{w_{p}^{\prime} \circ {g(y)}}}}\end{pmatrix}} & (60)\end{matrix}$

EXAMPLE

Consider the case where w_(p)(x)=x^(p). In this caseh ^(p() x)≈g ^(p)(Ax)+pη(x)g ^(p−1)(Ax)  (61)and the error term in the “moment” equation, (57) becomes$\begin{matrix}{{{\overset{\_}{ɛ}}_{p,k}^{8} = {{p{\int_{R^{n}}^{\quad}{x_{k}{\eta(x)}{g^{p - 1}({Ax})}p}}} = 1}},\ldots\quad,P} & (62)\end{matrix}$The (r,s) element of the P×P dimensional, (k,l) block of Γ, denotedΓ^(k,l) is given by $\begin{matrix}\begin{matrix}{{\Gamma^{k,l}\left( {r,s} \right)} = {{rsE}\left\lbrack {\int_{R^{n}}^{\quad}{x_{k}{\eta(x)}{g^{r - 1}({Ax})}{\int_{R^{n}}^{\quad}{z_{l}\quad{\mathbb{d}{\eta(z)}}{g^{s - 1}({Az})}}}}} \right\rbrack}} \\{= {\sigma^{2}{rs}{\int_{R^{n}}^{\quad}{x_{k}x_{l}{g^{r + s - 2}({Ax})}}}}} \\{= {\sigma^{2}{rs}{A^{- 1}}{\int_{R^{n}}^{\quad}{{\left\lbrack {\left( A^{- 1} \right)_{k}y} \right\rbrack\left\lbrack {\left( A^{- 1} \right)_{l}y} \right\rbrack}{g^{r + s - 2}(y)}}}}} \\{= {\sigma^{2}{rs}{A^{- 1}}{\sum\limits_{i = 1}^{n}\quad{\sum\limits_{j = 1}^{n}\quad{q_{ki}q_{lj}{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g^{r + s - 2}(y)}}}}}}}}\end{matrix} & (63)\end{matrix}$and (60) becomes in this special case $\begin{matrix}{u_{i,j} = \begin{pmatrix}{\int_{R^{n}}^{\quad}{y_{i}y_{j}1_{{supp}{({g{(y)}})}}}} & {2{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g(y)}}}} & \ldots & {P{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g^{P - 1}(y)}}}} \\\vdots & ⋰ & \quad & \vdots \\{P{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g^{P - 1}(y)}}}} & {2P{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g^{P}(y)}}}} & \ldots & {P^{2}{\int_{R^{n}}^{\quad}{y_{i}y_{j}{g^{{2P} - 2}(y)}}}}\end{pmatrix}} & (64)\end{matrix}$

9. Maximum Likelihood Estimation of the Affine Transformation Parameters

As we demonstrate in the next section, the LS algorithm for estimatingthe affine transformation parameters is accurate and computationallysimple as it requires only the solution of a set of linear equations. Inparticular there is no need for an iterative solution. Nevertheless, incases where the performance of this algorithm is not sufficient, it canserve to initialize the more complex maximum likelihood estimator (MLE)of the parameters, which we derive in this section.

Assuming the observation noise η(x) is Gaussian, we have under the highSNR assumption that ε^(g) is a zero mean Gaussian random vector withcovariance matrix Γ whose blocks are given in (59). Substituting (56)and (57) into (47) and rewriting it in a in a matrix form, we have underthe high SNR assumption that $\begin{matrix}{\begin{pmatrix}{\int_{R^{n}}^{\quad}{x_{k}{w_{1} \circ {h(x)}}}} \\\vdots \\{\int_{R^{n}}^{\quad}{x_{k}{w_{p} \circ (x)}}}\end{pmatrix} = {{{A^{- 1}}\begin{pmatrix}{\int_{R^{n}}^{\quad}{y_{1}{w_{1} \circ {g(y)}}}} & \ldots & {\int_{R^{n}}^{\quad}{y_{n}{w_{1} \circ {g(y)}}}} \\\vdots & ⋰ & \vdots \\{\int_{R^{n}}^{\quad}{y_{1}{w_{p} \circ {g(y)}}}} & \ldots & {\int_{R^{n}}^{\quad}{y_{n}{w_{p} \circ {g(y)}}}}\end{pmatrix}\begin{pmatrix}q_{k1} \\\vdots \\q_{kn}\end{pmatrix}} + \begin{pmatrix}{\overset{-}{ɛ}}_{1,k}^{g} \\\vdots \\{\overset{-}{ɛ}}_{P,k}^{g}\end{pmatrix}}} & (65)\end{matrix}$Rewriting (65) using matrix notations we haveh _(k) =|A ⁻¹ |Dq _(k)+ε_(k) ^(g).  (66)Let {overscore (h)}=[h₁ ^(T), h₂ ^(T), . . . , h_(n) ^(T)]^(T), q=[q₁^(T), q₂ ^(T),. . . , q_(n) ^(T)]^(T), and {overscore (D)}=I_(n){circleover (×)}D, where I_(n) is an n×n identity matrix. Hence thelog-likelihood function for the observation vector {overscore (h)} isgiven by $\begin{matrix}\begin{matrix}{{\log\quad{p\left( {\overset{\_}{h};q} \right)}} = {{{- \frac{m}{2}}{\log\left( {2\pi} \right)}} - {\frac{1}{2}\log\left( {\Gamma } \right)} -}} \\{\frac{1}{2}\left( {\overset{\_}{h} - {{A^{- 1}}\overset{\_}{D}q}} \right)^{T}{\Gamma^{- 1}\left( {\overset{\_}{h} - {{A^{- 1}}\overset{\_}{D}q}} \right)}}\end{matrix} & (67)\end{matrix}$

The MLE of the affine transformation parameters is found by maximizinglog p({overscore (h)};q) with respect to the model parameters {q_(i,j)}.Since this objective function is highly nonlinear in the problemparameters, the maximization problem cannot be solved analytically andwe must resort to numerical methods. In order to avoid the enormouscomputational burden of an exhaustive search, we use the followingtwo-step procedure. In the first stage we obtain a suboptimal initialestimate for the parameter vector q by using the algorithm described inSection 7. In the second stage we refine these initial estimates by aniterative numerical maximization of the log likelihood function. In ourexperiments we use the Broyden-Fletcher-Goldfarb-Shanno (BFGS)quasi-Newton optimization method. This algorithm requires evaluation ofthe first derivative of the objective function at each iteration:$\begin{matrix}\begin{matrix}{\frac{{\partial\log}\quad{p\left( {\overset{\_}{h};q} \right)}}{\partial q_{k,l}} = {{{- \frac{1}{2}}{tr}\left\{ {\Gamma^{- 1}\frac{\partial\Gamma}{\partial q_{k,l}}} \right\}} +}} \\{{\frac{\partial\left( {{A^{- 1}}\overset{\_}{D}q} \right)^{T}}{\partial q_{k,l}}{\Gamma^{- 1}\left( {\overset{\_}{h} - {{A^{- 1}}\overset{\_}{D}q}} \right)}} +} \\{\frac{1}{2}\left( {\overset{\_}{h} - {{A^{- 1}}\overset{\_}{D}q}} \right)^{T}\Gamma^{- 1}\frac{\partial\Gamma}{\partial q_{k,l}}{\Gamma^{- 1}\left( {\overset{\_}{h} - {{A^{- 1}}\overset{\_}{D}q}} \right)}^{T}}\end{matrix} & (68)\end{matrix}$such that $\begin{matrix}\begin{matrix}{\frac{\partial\left( {{A^{- 1}}\overset{\_}{D}q} \right)}{\partial q_{k,l}} = {{\frac{\partial{A^{- 1}}}{\partial q_{k,l}}\overset{\_}{D}q} + {{A^{- 1}}{\overset{\_}{D}}_{{{({k - 1})}n} + l}}}} \\{= {{\left( {- 1} \right)^{k + l}M_{k,l}\overset{\_}{D}q} + {{A^{- 1}}{\overset{\_}{D}}_{{{({k - 1})}n} + l}}}}\end{matrix} & (69)\end{matrix}$where M_(k,l) is the (k,l) minor of A⁻¹ and {overscore (D)}_(l) denotesthe l-th column of {overscore (D)}. Finally, $\begin{matrix}\begin{matrix}{\frac{\partial\Gamma^{s,t}}{\partial q_{k,l}} = {{\left( {- 1} \right)^{k + l}{A}M_{k,l}\Gamma^{s,t}} +}} \\{\sigma^{2}{A^{- 1}}\left\{ \begin{matrix}{{{\sum\limits_{j = 1}^{n}\quad{q_{t,j}u_{l,j}s}} \neq t},{s = k}} \\{{\sum\limits_{i = 1}^{n}\quad{q_{s,i}u_{i,l}}},{s \neq t},{t = k}} \\{{2{\sum\limits_{i = 1}^{n}\quad{q_{k,i}u_{l,i}}}},{s = {t = k}}} \\{0_{P \times P}\quad{otherwise}}\end{matrix} \right.}\end{matrix} & (70)\end{matrix}$where we have used the property that u_(i,j)=u_(j,i). Substitution of(70) and (69) into (68) provides an expression for the derivative of thelog-likelihood function in terms of the parameters of the inverse affinetransformation.

10. Numerical Examples

In this section we present some numerical examples to illustrate theoperation and robust performance of the proposed parameter estimationalgorithm, for a broad range of affine deformations.

The examples illustrates the operation of the proposed algorithm on acar image. The template image dimensions are 3100×1200. It is shown inthe bottom image of FIG. 4. The observed deformed image is shown in theupper image of the figure. The image coordinate system is [−1,1]×[−1,1].The translation vector is [−0.1,−0.14] and the translation estimationerror vector is [2·10⁻⁵,2.4·10⁻⁴]. The deforming transformation is givenby $A = \begin{pmatrix}{- 0.988} & 0.454 \\0.156 & {- 0.891}\end{pmatrix}$where the estimate obtained by the proposed procedure is$\hat{A} = {\begin{pmatrix}{- 0.987} & 0.453 \\0.156 & {- 0.891}\end{pmatrix}.}$To provide some measure for the magnitude of the error in estimating thedeformation matrix we list the value of A⁻¹À−I:${{A^{- 1}\hat{A}} - I} = {\begin{pmatrix}9.943 & {- 2.301} \\{- 1.298} & 0.228\end{pmatrix} \cdot 10^{- 4}}$Finally, the estimated deformation is applied to the original templatein order to obtain an estimate of the deformed object (middle image inFIG. 4) which can be compared with the deformed observation shown in theupper image.

The second example employs the same template as the first one and it isshown in the bottom image in FIG. 5. The observed deformed image isshown in the upper image of the figure. This image is both smaller inscale than in the first example and is observed with lower illumination,such that the illumination gain is a=0.58. The error in estimating thegain is à−a=0.726·10⁻⁷. The image coordinate system is [−1,1]×[−1,1].The translation vector is [−0.4,−0.5] and the translation estimationerror vector is [1.85·10⁻³,3.92·10⁻⁵]. The deforming transformation isgiven by $A = \begin{pmatrix}0.4854 & 0.3527 \\{- 0.3527} & 0.4854\end{pmatrix}$where the estimate obtained by the proposed procedure is${\hat{A} = \begin{pmatrix}0.4722 & 0.3626 \\{- 0.353} & 0.4858\end{pmatrix}},$while ${{A^{- 1}\hat{A}} - I} = {\begin{pmatrix}0.81 & {- 2.65} \\0.001 & {- 0.11}\end{pmatrix} \cdot 10^{- 2}}$

Finally, the estimated deformation is applied to the original templatein order to obtain an estimate of the deformed object (middle image inFIG. 5) which can be compared with the deformed observation shown in theupper image.

Part B: Estimation of Multi-Dimensional Homeomorphisms for ObjectRecognition in Noisy Environments

1. Introduction

This section is concerned with the general problem of automatic objectrecognition based on a set of known templates, where the deformationsare differentiable homeomorphisms having a continuous and differentiableinverse, such that the derivative of the inverse is also continuous.

The center of the proposed solution is a method that reduces the highdimensional problem of evaluating the orbit created by applying the setof all possible homeomorphic transformations in the group to thetemplate, into a problem of analyzing a function in a low dimensionalEuclidian space. In general, an explicit modeling of the homeomorphismsgroup is impossible. We therefore choose to solve this problem byfocusing on subsets of the homeomorphisms group which are also subsetsof vector spaces. This may be regarded as an approximation thehomeomorphism using polynomials, based on the denseness of thepolynomials in the space of continuous functions with compact support.In this setting, the problem of estimating the parametric model of thedeformation is solved by a linear system of equations in the lowdimensional Euclidian space.

More specifically, consider the problem given by h(x₁, . . . ,x_(n))=g(φ(x₁, . . . , x_(n))) where φ(x₁, . . . , x_(n))=(φ(x₁, . . . ,x_(n)), . . . , φ_(n)(x₁, . . . , x_(n))). In the problem settingconsidered here h and g are given while φ should be estimated. We nextsummarize the algorithmic solution for the problem of estimating thehomeomorphic deformation in the absence of observation noise. Tosimplify the notation and the accompanying discussion we present thesolution for the case where the observed signals are one-dimensional.The derivation for higher dimensions follows along similar lines.

2. Parametric Modeling and Estimation Homeomorphisms with aDifferentiable and Continuous Inverse

Let R be the one-dimensional Euclidean space, and let the correspondingmeasure be the standard Lebesgue measure. Consider the general casewhere h:X→Y and g:X→Y are bounded and measurable functions with compactsupport X⊂R, and where Y⊂R, such thath(x)=g(φ(x)).  (71)In the following we assume that G is the group of differentiablehomeomorphisms such that each element of G has a continuous anddifferentiable inverse, where the derivative of the inverse is alsocontinuous. Let C(X) denote the set of continuous real-valued functionsof X onto itself, where the norm is the standard L₂ norm. By the aboveassumption every φ⁻¹, (φ⁻¹)′εC(X). Since C(X) is a normed separablespace, there exists a countable set of basis functions {e_(i)}⊂C(X),such that for every φεG, $\begin{matrix}{{\left( \phi^{- 1} \right)^{\prime}(x)} = {\sum\limits_{i}{b_{i}{{e_{i}(x)}.}}}} & (72)\end{matrix}$

In other words, it is assumed that every element in the group and itsderivative can be represented as a convergent series of basis functionsof the separable space C(X). Our goal then, is to obtain the expansionof φ⁻¹(x) with respect to the basis functions {e_(i)(x)}. In practice,the series (72) is replaced by a finite sum, i.e., we have 1≦i≦m.

Let z=φ(x). Then φ⁻¹(z)=x, and hence(φ⁻¹)′(z)dz=dx  (73)Let {w_(p)}_(p=1) ^(P):Y→R be a set of continuous functions thatseparate the points of Y. Hence, these functions separate the points ofthe image of h (and g ). As we show next, these functions are employedto translate the identity relation (140) into a set of P equations:$\begin{matrix}\begin{matrix}{{\int_{- \infty}^{\infty}{{w_{p}\left( {h(x)} \right)}\quad{\mathbb{d}x}}} = {\int_{- \infty}^{\infty}{{w_{p}\left( {g\left( {\phi(x)} \right)} \right)}\quad{\mathbb{d}x}}}} \\{= {\int_{- \infty}^{\infty}{\left( {\phi^{- 1}(z)} \right)^{\prime}{w_{p}\left( {g(z)} \right)}\quad{\mathbb{d}z}}}} \\{= {\sum\limits_{i = 1}^{m}\quad{b_{i}{\int_{- \infty}^{\infty}{{e_{i}(x)}{w_{p}\left( {g(x)} \right)}\quad{\mathbb{d}x}}}}}} \\{{p = 1},\ldots\quad,P}\end{matrix} & (74)\end{matrix}$Rewriting (74) in a matrix form we have $\begin{matrix}{\begin{pmatrix}{\int{w_{1} \circ \quad h}} \\\vdots \\{\int{w_{p} \circ \quad h}}\end{pmatrix} = {\begin{pmatrix}{\int{e_{1}{w_{1} \circ \quad g}}} & \ldots & {\int{e_{m}{w_{1} \circ \quad g}}} \\\vdots & ⋰ & \vdots \\{\int{e_{1}{w_{p} \circ \quad g}}} & \ldots & {\int{e_{m}{w_{p} \circ \quad g}}}\end{pmatrix}\begin{pmatrix}b_{1} \\\vdots \\b_{m}\end{pmatrix}}} & (75)\end{matrix}$We thus have the following theorem:Theorem: The homeomorphism φ satisfying the parametric model defined in(71) is uniquely determined if the matrix $\begin{matrix}\begin{pmatrix}{\int{e_{1}{w_{1} \circ \quad g}}} & \ldots & {\int{e_{m}{w_{1} \circ \quad g}}} \\\vdots & ⋰ & \vdots \\{\int{e_{1}{w_{p} \circ \quad g}}} & \ldots & {\int{e_{m}{w_{p} \circ \quad g}}}\end{pmatrix} & (76)\end{matrix}$is full rank.Thus, provided that {w_(p)}_(p=1) ^(P), are chosen such that (76) isfull rank, the system (75) (in the absence of noise we take P=m) can besolved for the parameter vector [b₁, . . . , b_(m)]. It is clear that inthe absence of noise, any set of functions {w_(p}) _(p=1) ^(m) such that(76) is full rank is equally optimal. As we show next the situation isremarkably different in the presence of observation noise. Since thesolution for the case where the signals are of higher dimension followsfrom similar arguments we first present the method for estimatingmulti-dimensional deformations before addressing the solution to thecase where the observations are noisy.

3. Estimation of 2-D Deformations

Let S be some compact subset of R². Consider the general case whereh:S→R and g:S→R are bounded and measurable functions, such thath(x,y)=g(φ(x,y))  (77)whereφ(x,y)=(u,v)=(φ₁(x,y),φ₂(x,y))  (78)In the following we assume that G is the group of 2-D differentiablehomeomorphisms such that for each φεG, both φ₁ and φ₂ have a continuousand differentiable inverse, where the partial derivatives of the inverseare also continuous. More precisely, let C(S) denote the set ofcontinuous real-valued transformations from S to R. By the aboveassumption every$\phi,\frac{\partial\phi_{1}}{\partial x},{\frac{\partial\phi_{1}}{\partial y} \in {C(S)}},\phi_{2},\frac{\partial\phi_{2}}{\partial x},{\frac{\partial\phi_{2}}{\partial y} \in {{C(S)}.}}$Thus, an equivalent form of (77) isg(u,v)=h(φ₁ ⁻¹(u,v),φ₂ ⁻¹(u,v))  (79)Since C(S) is a normed separable spaces, we assume that there exists acountable set of basis functions {e_(i)(x,y)}⊂C(S), such that for everyφεG, $\begin{matrix}{{{\phi_{1}\left( {x,y} \right)} = {\sum\limits_{i}{a_{i}{e_{i}\left( {x,y} \right)}}}}{{\phi_{2}\left( {x,y} \right)} = {\sum\limits_{i}{b_{i}{e_{i}\left( {x,y} \right)}}}}} & (80)\end{matrix}$In other words, it is assumed that for every element φ in the group,each of the functions φ₁, φ₂ and their partial derivatives can berepresented as convergent series of basis functions of the separablespace C(X). Let J(x,y) denote the Jacobian of the transformation, i.e.,$\begin{matrix}{{J\left( {x,y} \right)} = {\begin{matrix}\frac{\partial\phi_{1}}{\partial x} & \frac{\partial\phi_{1}}{\partial y} \\\frac{\partial\phi_{2}}{\partial x} & \frac{\partial\phi_{2}}{\partial y}\end{matrix}}} & (81)\end{matrix}$Substituting the parametric representation (80) into (81), we obtain aparametric representation of the Jacobian in terms of the parameters{a_(i)}, {b_(i)}.

Let {w_(p)}_(p=1) ^(P):R→R be a set of continuous functions thatseparate the points of R. Since these functions separate the points ofthe image of h (and g) they are employed to translate the identityrelation (77) into a set of P linearly independent equations that can besolved first for the parametric representation of the Jacobian, and in asecond stage for the parameters {a_(i)}, {b_(i)}. Thus, $\begin{matrix}{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{w_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{w_{p}\left( {h\quad\left( {{\phi_{1}^{- 1}\left( {u,v} \right)},{\phi_{2}^{- 1}\left( {u,v} \right)}} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}}} & (82)\end{matrix}$By substituting J(x,y) in the system (82) with it parametric model, andsolving the system, an estimate of the Jacobin is obtained. Once theJacobian of the coordinate transformation is available, the parametricmodels of the homeomorphisms can be determined using the followingapproach: $\begin{matrix}{{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{uw}_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{uw}_{p}\left( {h\left( {{\phi_{1}^{- 1}\left( {u,v} \right)},{\phi_{2}^{- 1}\left( {u,v} \right)}} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{\phi_{1}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{\sum\limits_{i}\quad{a_{i}{e_{i}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}}} = {\sum\limits_{i}\quad{a_{i}{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{e_{i}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}}}}}}}{and}} & (83) \\{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{vw}_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{vw}_{p}\left( {h\left( {{\phi_{1}^{- 1}\left( {u,v} \right)},{\phi_{2}^{- 1}\left( {u,v} \right)}} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{\phi_{2}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}} = {{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{\sum\limits_{i}{b_{i}{e_{i}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}}} = {\sum\limits_{i}{b_{i}{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{J\left( {x,y} \right)}{e_{i}\left( {x,y} \right)}{w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}}}}}}} & (84)\end{matrix}$The motivation for choosing this approach is that it enables us todirectly “translate” the parametric model of the coordinatetransformation into a set of linear equations. Example: Let {e_(i)} bethe set of 2-D polynomials functions on X×Y. The polynomials are furtherassumed to be of some finite order n in each of their variables. Hence,we have $\begin{matrix}{{{\phi_{1}\left( {x,y} \right)} = {{\sum\limits_{i = 0}^{n}\quad{\sum\limits_{j = 0}^{n}\quad{a_{i,j}x^{i}y^{j}}}} = {x_{n}^{T}{Ay}_{n}}}}{{\phi_{2}\left( {x,y} \right)} = {{\sum\limits_{i = 0}^{n}\quad{\sum\limits_{j = 0}^{n}\quad{b_{i,j}x^{i}y^{j}}}} = {x_{n}^{T}{By}_{n}}}}} & (85)\end{matrix}$where we define x_(n)[=1, x, . . . , x^(n)]^(T), y_(n)=[1, y, . . . ,y^(n)]^(T). Hence, $\begin{matrix}{{J\left( {x,y} \right)} = {{\begin{matrix}{{\overset{.}{x}}_{n}^{T}{Ay}_{n}} & {x_{n}^{T}A{\overset{.}{y}}_{n}} \\{{\overset{.}{x}}_{n}^{T}{Ay}_{n}} & {x_{n}^{T}A{\overset{.}{y}}_{n}}\end{matrix}} = {x_{2n}^{T}{Cy}_{2n}}}} & (86)\end{matrix}$and (82) yields the system $\begin{matrix}{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{uw}_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{w_{p}\left( {h\left( {x,y} \right)} \right)}x_{2n}^{T}{CY}_{2n}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}} & (87)\end{matrix}$Since the composing set of functions is known, the system of P equations(87) can be solved by a least squares solution for the 4n²≦P unknownelements of C, and hence for J(x,y). Once the Jacobian of the coordinatetransformation is available, the parametric models of the homeomorphismsare determined by substituting their polynomial models into (83) and(84): $\begin{matrix}{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{uw}_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{\left( {X_{2n}^{T}{CY}_{2n}} \right)\left( {X_{n}^{T}{AY}_{n}} \right){w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}\quad{and}}}}} & (88) \\{{\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{{{vw}_{p}\left( {g\left( {u,v} \right)} \right)}\quad{\mathbb{d}u}\quad{\mathbb{d}v}}}} = {\int_{- \infty}^{\infty}{\int_{- \infty}^{\infty}{\left( {X_{2n}^{T}{CY}_{2n}} \right)\left( {X_{n}^{T}{BY}_{n}} \right){w_{p}\left( {h\left( {x,y} \right)} \right)}\quad{\mathbb{d}x}\quad{\mathbb{d}y}}}}} & (89)\end{matrix}$As an illustrative example consider the image shown in FIG. 6. The imageon the right shows a severely deformed version of the original image.The deformation is composed of both homeomorphic and affinetransformations. The deformation is estimated using only the two images:the original and the deformed, by employing a two--dimensional versionof the algorithm briefly introduced earlier. Having estimated thedeformation, its inverse is applied to the deformed image on the right,to produce the registered image on the left, which is virtuallyidentical to the original image (not shown here). This exampledemonstrates that image registration is possible despite the severedeformation, and can be implemented as a solution to a linear system ofequations.

4. Observations Subject to Additive Noise: The Least Squares Solution

In the presence of noise the observed data is given byh(x)=g(φ(x))+η(x).  (90)Assuming that the noise has a zero mean, and that its higher orderstatistics are known, we first address questions related to issue of theoptimal choice of the set {w_(p)} for each template function g. We beginby adapting the solution derived in the previous section for thedeterministic case, to a least squares solution for the modelparameters. In the presence of noise the basic equation (74) becomes$\begin{matrix}\begin{matrix}{{\int_{- \infty}^{\infty}{{w_{p}\left( {h(x)} \right)}\quad{\mathbb{d}x}}} = {\int_{- \infty}^{\infty}{{w_{p}\left\lbrack {{g\left( {\phi(x)} \right)} + {\eta(x)}} \right\rbrack}\quad{\mathbb{d}x}}}} \\{= {\int_{- \infty}^{\infty}{{w_{p}\left\lbrack {{g(z)} + {\eta\left( {\phi^{- 1}(z)} \right)}} \right\rbrack}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}}} \\{= {{\int_{- \infty}^{\infty}{{w_{p}\left( {g(z)} \right)}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}} + ɛ_{p}^{g}}}\end{matrix} & (91)\end{matrix}$where we define the random variable $\begin{matrix}\begin{matrix}{ɛ_{p}^{g} = {\int_{- \infty}^{\infty}{\left\{ {{w_{p}\left\lbrack {{g(z)} + {\eta\left( {\phi^{- 1}(z)} \right)}} \right\rbrack} - {w_{p}\left( {g(z)} \right)}} \right\}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}}} \\{= {\int_{- \infty}^{\infty}\left\{ {{w_{p}\left\lbrack {{g\left( {\phi(x)} \right)} + {\eta(x)}} \right\rbrack} - {{w_{p}\left( {g\left( {\phi(x)} \right)} \right\}}\quad{\mathbb{d}x}}} \right.}}\end{matrix} & (92)\end{matrix}$Substituting (72) into (91), we obtain the linear system of equations$\begin{matrix}{{{\int_{- \infty}^{\infty}{{w_{p}\left( {h(x)} \right)}\quad{\mathbb{d}x}}} = {{{\sum\limits_{i}{b_{i}{\int_{- \infty}^{\infty}{{e_{i}(x)}{w_{p}\left\lbrack {g(x)} \right\rbrack}\quad{\mathbb{d}x}}}}} + {ɛ_{p}^{g}\quad p}} = 1}},\ldots\quad,P} & (93)\end{matrix}$The system (93) represents a linear regression problem where the noisesequence {ε_(p) ^(g)} is non-stationary since its statistical momentsdepend on the choice of w_(p) for each p. The regressors are functionsof w_(p) and the template g, and hence are deterministic. Provided thatthe sequence of composition functions {w_(p)}_(p=1) ^(P), is chosen suchthat the resulting regressors matrix is full rank, the system (93) issolved by a linear least squares method such that the l₂ norm of thenoise vector is minimized. The dependence of the noise sequence {ε_(p)^(g)} on the choice of w_(p) suggests that different choices of thecomposition sequence {w_(p)}_(p=1) _(P) may provide different solutions.We shall be first interested in systems such that for each p, the linearconstraint imposed by w_(p) is unbiased (and thus the “effective noise”that corresponds to each w_(p), is zero mean).

5. Construction of Unbiased Linear Constraints

Consider the case where we choose w_(p)(x)=Σ_(k)α_(k) ^(P)x^(k), and theadditive noise is white, Gaussian with zero mean and variance σ². Wenext evaluate the mean term, Eε_(p) ^(g), of the “effective noise”, sothat a correction term can be introduced such that the non-zero-meanerror term ε_(p) ^(g) in (92) is replaced by a zero mean error term. Tosimplify the notation we will take advantage of linear structure ofw_(p)(x), and analyze only the case where w_(p)(x)=x^(p) and thegeneralization is straightforward. Thus, in this case $\begin{matrix}\begin{matrix}{ɛ_{p}^{g} = {\int_{- \infty}^{\infty}{\left\{ {\left\lbrack {{g(z)} + {\eta\left( {\phi^{- 1}(z)} \right)}} \right\rbrack^{p} - {g^{p}(z)}} \right\}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}}} \\{= {\int_{- \infty}^{\infty}{\sum\limits_{j = 1}^{p}\quad{\begin{pmatrix}p \\j\end{pmatrix}{g^{p - j}(z)}{\eta^{j}\left( {\phi^{- 1}(z)} \right)}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}}}}\end{matrix} & (94)\end{matrix}$Since the noise is Gaussian with zero mean, all terms involving oddorder moments of the noise vanish and hence $\begin{matrix}\begin{matrix}{{Eɛ}_{p}^{g} = {\sum\limits_{j = 1}^{\lbrack\frac{p}{2}\rbrack}\quad{\begin{pmatrix}p \\{2j}\end{pmatrix}{E\left\lbrack {\eta^{2j}\left( {\phi^{- 1}(z)} \right)} \right\rbrack}{\int_{- \infty}^{\infty}{{g^{p - {2j}}(z)}\left( \phi^{- 1} \right)^{\prime}(z)\quad{\mathbb{d}z}}}}}} \\{= {\sum\limits_{i}{b_{i}{\sum\limits_{j = 1}^{\lbrack\frac{p}{2}\rbrack}\quad{\begin{pmatrix}p \\{2j}\end{pmatrix}{E\left\lbrack {\eta^{2j}\left( {\phi^{- 1}(z)} \right)} \right\rbrack}{\int_{- \infty}^{\infty}{{g^{p - {2j}}(z)}{e_{i}(z)}\quad{\mathbb{d}z}}}}}}}}\end{matrix} & (95)\end{matrix}$and E[η^(2j)(φ⁻¹(z))] is a constant which is a function of only theindex 2j of the even order moment, and of σ².

Hence, for the case where w_(p)(x)=x^(p), the zero-mean-noise version of(93) becomes $\begin{matrix}{{\int_{- \infty}^{\infty}{{h^{p}(x)}\quad{\mathbb{d}x}}} = {{\sum\limits_{i}{b_{i}\left\{ {{\int_{- \infty}^{\infty}{{e_{i}(z)}{g^{p}(z)}\quad{\mathbb{d}z}}} + {\sum\limits_{j = 1}^{\lbrack\frac{p}{2}\rbrack}\quad{\begin{pmatrix}p \\{2j}\end{pmatrix}{E\left\lbrack {\eta^{2j}\left( {\phi^{- 1}(z)} \right)} \right\rbrack}{\int_{- \infty}^{\infty}{{g^{p - {2j}}(z)}{e_{i}(z)}\quad{\mathbb{d}z}}}}}} \right\}}} + {\overset{\sim}{ɛ}}_{p}^{g}}} & (96)\end{matrix}$where {tilde over (ε)}_(p) ^(g) is a zero mean random variable.

Thus the system (96) represents a different linear regression problemwhere the observation noise is non-stationary, but with a zero mean. Theregressors are functions of w_(p), the template g, and the knownstatistics of the noise. Hence the regressors are deterministic.Provided that the resulting regressors matrix is full rank, the system(96) is solved by a linear least squares method such that Σ_(p=1)^(p)|{tilde over (ε)}_(p) ^(g)|² is minimized.

6. Analysis of the High SNR Case

In this section we analyze the proposed method assuming w_(p)(x)=x^(p),when it is assumed that the signal to noise ratio is high.

To achieve improved numerical stability in the presence of noise, in thefollowing, we consider the normalized version of the system (93), i.e.,$\begin{matrix}{{{\frac{1}{\int_{- \infty}^{\infty}{{g^{p}(x)}\quad{\mathbb{d}x}}}{\int_{- \infty}^{\infty}{{h^{p}(x)}\quad{\mathbb{d}x}}}} = {{{\sum\limits_{i}{b_{i}\frac{1}{\int_{- \infty}^{\infty}{{g^{p}(x)}\quad{\mathbb{d}x}}}{\int_{- \infty}^{\infty}{{{e_{i}(x)}\left\lbrack {g^{p}\left( {\phi(x)} \right)} \right\rbrack}\quad{\mathbb{d}x}}}}} + {{\overset{\_}{ɛ}}_{p}^{g}\quad p}} = 1}},\ldots\quad,P} & (97)\end{matrix}$Since h(x)=g(φ(x))+η(x) we have under the high SNR assumption that thecontribution of high noise powers can be neglected, i.e.,h ^(p)(x)≈g ^(p)(φ(x))+pη(x)g ^(p−1)(φ(x))  (98)Hence, the error term in (92) is approximated under the high SNRassumption by $\begin{matrix}\begin{matrix}{{\overset{\_}{ɛ}}_{p}^{g} = {\frac{p}{\int_{- \infty}^{\infty}{{g^{p}(x)}\quad{\mathbb{d}x}}}{\int_{- \infty}^{\infty}{{\eta(x)}{g^{p - 1}\left( {\phi(x)} \right)}\quad{\mathbb{d}x}}}}} \\{{= {{c_{p}{\int_{- \infty}^{\infty}{{\eta(x)}{g^{p - 1}\left( {\phi(x)} \right)}\quad{\mathbb{d}{xp}}}}} = 1}},\ldots\quad,P}\end{matrix} & (99)\end{matrix}$where we define c_(p) = p/∫_(−∞)^(∞)g^(p)(x)  𝕕x.Clearly, E[{tilde over (ε)}_(p) ^(g)]=0, p=1, . . . , P. We nextevaluate the error covariances of the system, under the high SNRassumption. Let ε^(g)=[{tilde over (ε)}₁ ^(g), . . . , {tilde over(ε)}_(P) ^(g)]^(T) and Γ=E[ε^(g)(ε^(g))^(T)]. Thus, the (k,l) element ofF is given by $\begin{matrix}\begin{matrix}{\Gamma_{k,l} = {c_{k}c_{l}{E\left\lbrack {\int_{- \infty}^{\infty}{{\eta(x)}{g^{k - 1}\left( {\phi(x)} \right)}\quad{\mathbb{d}x}{\int_{- \infty}^{\infty}{{\eta(y)}{g^{l - 1}\left( {\phi(y)} \right)}\quad{\mathbb{d}y}}}}} \right\rbrack}}} \\{= {c_{k}c_{l}\sigma^{2}{\int_{- \infty}^{\infty}{{g^{k + l - 2}\left( {\phi(x)} \right)}\quad{\mathbb{d}x}}}}} \\{= {c_{k}c_{l}\sigma^{2}{\int_{- \infty}^{\infty}{\left( {\phi^{- 1}(z)} \right)^{\prime}{g(z)}^{k + l - 2}\quad{\mathbb{d}z}}}}} \\{= {c_{k}c_{l}\sigma^{2}{\sum\limits_{i}{b_{i}{\int_{- \infty}^{\infty}{{e_{i}(z)}{g(z)}^{k + l - 2}\quad{\mathbb{d}z}}}}}}}\end{matrix} & (100)\end{matrix}$Rewriting (100) in matrix form we have $\begin{matrix}{\Gamma = {\sigma^{2}{\sum\limits_{i}{b_{i}u_{i}\quad{where}}}}} & (101) \\{u_{i} = \begin{pmatrix}{c_{1}^{2}{\int_{- \infty}^{\infty}{{e_{i}(z)}\quad{\mathbb{d}z}}}} & \ldots & {c_{1}c_{P}{\int_{- \infty}^{\infty}{{e_{i}(z)}{g(z)}^{P - 1}\quad{\mathbb{d}z}}}} \\\vdots & ⋰ & \vdots \\{c_{1}c_{P}{\int_{- \infty}^{\infty}{{e_{i}(z)}\quad{\mathbb{d}z}}}} & \ldots & {c_{P}^{2}{\int_{- \infty}^{\infty}{{e_{i}(z)}{g(z)}^{{2P} - 2}\quad{\mathbb{d}z}}}}\end{pmatrix}} & (102)\end{matrix}$

7. Maximum Likelihood Estimation of the Homeomorphism

As we illustrate below, the LS algorithm for estimating thehomeomorphism parameters is accurate and computationally simple as itrequires only the solution of a set of linear equations. In particularthere is no need for an iterative solution. Nevertheless, in cases wherethe performance of this algorithm is not sufficient, it can serve toinitialize the more complex maximum likelihood estimator (MLE) of theparameters, which we derive in this section.

Assuming the observation noise n(x) is Gaussian, we have under the highSNR assumption that ε^(g) is a zero mean Gaussian random vector withcovariance matrix Γ given in (101). Rewriting (93) in a matrix form,assuming the high SNR assumption we haveh=Gb+ε ^(g)  (103)where h = [∫_(−∞)^(∞)w₁(h(x))  𝕕x, …  , ∫_(−∞)^(∞)w_(P)(h(x))  𝕕x]^(T),and G is a P×m matrix such thatG_(k, l) = ∫_(−∞)^(∞)e_(k)(x)w_(l)[g(x)]  𝕕x.Hence the log-likelihood function for the observation vector h is givenby $\begin{matrix}{{\log\quad{p\left( {h;b} \right)}} = {{{- \frac{m}{2}}{\log\left( {2\pi} \right)}} - {\frac{1}{2}{\log\left( {\Gamma } \right)}} - {\frac{1}{2}\left( {h - {Gb}} \right)^{T}{\Gamma^{- 1}\left( {h - {Gb}} \right)}}}} & (104)\end{matrix}$The MLE of the field parameters is found by maximizing log p(h;b) withrespect to the model parameters b. Since this objective function ishighly nonlinear in the problem parameters, the maximization problemcannot be solved analytically and we must resort to numerical methods.In order to avoid the enormous computational burden of an exhaustivesearch, we use the following two-step procedure. In the first stage weobtain a suboptimal initial estimate for the parameter vector b by usingthe algorithm described in Section 16. In the second stage we refinethese initial estimates by an iterative numerical maximization of thelog likelihood function. In our experiments we use theBroyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton optimizationmethod. This algorithm requires evaluation of the first derivative ofthe objective function at each iteration: $\begin{matrix}{\frac{{\partial\log}\quad{p\left( {y;b} \right)}}{\partial b_{i}} = {{{- \frac{1}{2}}{tr}\left\{ {\Gamma^{- 1}\frac{\partial\Gamma}{\partial\theta_{i}}} \right\}} + {G_{i}^{T}{\Gamma^{- 1}\left( {h - {Gb}} \right)}} + {\frac{1}{2}\left( {h - {Gb}} \right)^{T}\Gamma^{- 1}\frac{\partial\Gamma}{\partial\theta_{i}}{\Gamma^{- 1}\left( {h - {Gb}} \right)}^{T}}}} & (105)\end{matrix}$where b_(i) is the ith element of b, G_(i) denotes the ith column of G,and $\begin{matrix}{\frac{\partial\Gamma}{\partial\theta_{i}}\sigma^{2}u_{i}} & (106)\end{matrix}$

8. Numerical Examples

In this section we present numerical examples to illustrate theoperation and performance of the proposed model, and parameterestimation algorithm. In the first example we consider the case wherethe template function is given by g(x)=sin(πx)+0.2 sin(2πx)−0.3sin(3πx)+0.4 sin(5πx) and the deformation isφ(x)=0.6x−0.4x²−0.8x³+1.6x⁴. The observed signal is h(x)=g(φ(x))+n(x),where the noise is normally distributed with zero mean and variance of0.04. The number of available samples of both the template and the noisyobservation is 10000, so that the effects of both integration errors aswell as of low sampling rates (and the resulting need for interpolatingthe data) are negligible. We illustrate the performance of the proposedsolution using Monte Carlo simulations. The experimental results arebased on 500 independent realizations of the observed signal. Theestimator employed is the least-squares estimator. The top-left plot ofFIG. 7 shows the template and a single noisy observation. The top-rightplot shows the deforming function and the mean value of its estimates.The two plots on the bottom depict the bias and the mean squared errorin estimating the deformation as a function of x (the deformation takesplace along the x-axis).

The next example illustrates the operation of the proposed algorithm onan image of a real object. The image dimensions are 1170×1750. The topimage in FIG. 8 depicts the original image of the aircraft, which isalso employed as the template. In order to be able to evaluate theperformance of the method the image of the object is then deformed, anda zero mean Gaussian observation noise is added—to obtain the simulatednoisy observation of the aircraft. See the middle image. The deformingfunction (which takes place only along the x-axis) is depicted using asolid line in FIG. 9 along with the estimate (dotted line) obtained byapplying the proposed solution to the noisy observation shown in FIG. 8.Finally, the estimated deformation is applied to the original templatein order to obtain an estimate of the deformed object (lower image inFIG. 8) which can be compared with the deformed noisy object shown inthe middle image.

This invention addresses the problem of finding the homeomorphictransformation relating a given observation on a planar object with somepre-chosen template of this object. The direct approach for estimatingthe transformation is to apply each of the deformations in the group tothe template in a search for the deformed template that matches theobservation. The disclosed method employs a set of non-linear operatorsto replace this high dimensional problem by an equivalent linearproblem, expressed in terms of the unknown transformation modelparameters. Thus, the problem of finding the parametric model of thedeformation is mapped, by this set on non-linear operators, into a setof linear equations which is then solved for the transformationparameters. The proposed solution has been further extended to includethe case where the deformation relating the observed signature of theobject and the template, is composed of both a geometric deformation dueto the transformation of the coordinate system, and a constantillumination change. The proposed solution is unique and accurate and isapplicable to essentially any continuous transformation regardless ofthe magnitude of the deformation.

The embodiments described above are given as examples of the method ofthe present invention. Those skilled in the art will be aware of variousmodifications of the present invention. These variations are to beincluded in the spirit of the present invention.

1. A method for comparing views of an object comprising: (a) creating aset of known templates of objects; (b) observing one of said objects inan unknown transformation; and (c) deforming at least one of thetemplates to correlate the observed object with one of the knowntemplates.
 2. The method of claim 1, wherein the step of deformingreduces a high dimensional calculation to a linear parametricestimation.
 3. The method of claim 2, wherein the method is used formotion analysis, image registration, or object recognition.
 4. Themethod of claim 2, wherein a set of non-linear operators is used toreplace the high dimensional calculation with the linear parametricestimation.
 5. The method of claim 4, wherein the transformation isaffine and the estimation is exact.